Why Hurricane Is Still Relevant When You Can Try To Build Your Own Classification Engine With GPT-4
Why your DIY AI classification tool will cost you £500k and still fail.
Over the last 18 months, our sales team and I have had the same conversation at least a dozen times. A CTO or head of engineering at a logistics company or online retailer explains their plan: “Why pay for Hurricane when we can build our own classification engine with GPT-4?” The pitch is elegant, the business case looks solid, and the engineering team is confident.
I now know exactly how this story ends.
It ends 14 to 18 months later with £500k spent, a classification system that cannot pass a compliance audit, and a sheepish conversation about Hurricane contract terms. Sometimes the CTO is still there for that conversation; often they are not.
The LLM promise versus the customs reality
Large Language Models are extraordinary at generating fluent text. They predict likely words based on patterns from the public internet. For writing product blurbs or summarising articles, this can be brilliant.
For customs classification, it is a liability.
LLMs guess. They do not reason over verified trade databases; they pattern-match against whatever they ingested: a forum post from 2019, a cached PDF from a freight forwarder’s blog, an outdated duty rate table. Ask an LLM for an HS code and you get something that looks convincing, yet it is often wrong in ways that cost money.


Cost comparison at scale
| Item | GPT-4 style classifier | Hurricane |
|---|---|---|
| Per query cost | £0.10 to £0.20 | £0.04 |
| Daily volume | 20,000 | 20,000 |
| Annual queries | 7,300,000 | 7,300,000 |
| Estimated annual cost | £730,000 to £1,460,000 | £292,000 |
| Included features | N/A | Legal citations, audit trails |
| Potential saving versus GPT-4 | N/A | £438,000 to £1,168,000 (60 to 80 percent) |
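The arithmetic behind the table can be checked with a short sketch. It assumes a flat per-query price and a 365-day year. The function and variable names are illustrative only and are not part of any Hurricane or OpenAI API.

```python
from decimal import Decimal

DAYS_PER_YEAR = 365  # assumption: volume is constant every day of the year


def annual_cost(per_query_gbp: str, daily_volume: int) -> Decimal:
    """Annual spend in GBP for a flat per-query price."""
    return Decimal(per_query_gbp) * daily_volume * DAYS_PER_YEAR


daily_volume = 20_000
llm_low = annual_cost("0.10", daily_volume)
llm_high = annual_cost("0.20", daily_volume)
hurricane = annual_cost("0.04", daily_volume)

for label, llm in (("low", llm_low), ("high", llm_high)):
    saving = llm - hurricane
    pct = saving / llm * 100
    print(
        f"{label}: LLM £{llm:,.0f}, Hurricane £{hurricane:,.0f}, "
        f"saving £{saving:,.0f} ({pct:.0f}%)"
    )
```

Swap in your own daily volume and per-query prices to see how the gap moves. The percentage saving depends only on the price ratio, not on volume.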
Accuracy and knowledge currency
Accuracy. Generic web-trained LLMs typically reach about 70 percent accuracy on HS code classification in controlled conditions. Hurricane, powered by 340 million verified customs declarations, achieves 98 percent accuracy in live production, with explainability, audit trails, and legal references for every decision.
Knowledge currency. LLM knowledge cut-offs are months or years old. Tariffs change quarterly, sometimes monthly. Hurricane ingests tariff updates within 24 hours of publication from more than 200 customs authorities globally.
Why “we will build it ourselves” fails
Months 1 to 3. A quick demo works on 20 test cases. Budgets get approved.
Months 4 to 8. You discover you need structured customs data, duty rates, agreements, and restrictions. Scope triples, liability grows.
Months 9 to 14. You build pipelines and patch LLM guesses with brittle rules. The codebase balloons, ownership fades.
Months 15 to 18. Tariff maintenance becomes a full-time job. Latency is too slow for checkout. Costs spike.
Months 19 to 22. Test audits show 18 to 24 percent error rates, often “close” yet still costly.
Month 23. You call Hurricane.
Month 24. Project cancelled, vendor selected.
What you are actually up against
Hurricane has spent 10 years building direct data relationships and a compliance logic engine that mirrors how customs officers decide. We do not guess, we calculate, then we cite. Every classification includes the HS code to full specificity, duty rates, legal citations, restricted goods flags, a full audit trail, and country notes.
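To make that list concrete, here is a minimal sketch of what a classification record carrying those fields might look like. Every class name, field name, and value below is a hypothetical placeholder chosen for illustration. This is not Hurricane's actual response schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One step in the decision history (hypothetical shape)."""
    timestamp: datetime
    detail: str


@dataclass(frozen=True)
class ClassificationResult:
    """Hypothetical record mirroring the fields listed above."""
    hs_code: str
    duty_rate_percent: float
    legal_citations: tuple[str, ...]
    restricted: bool
    country_notes: str
    audit_trail: tuple[AuditEvent, ...] = ()

    def is_audit_ready(self) -> bool:
        """Defensible only if it carries both a citation and a trail."""
        return bool(self.legal_citations) and bool(self.audit_trail)


# Placeholder values only; the rate and citation are not real data.
example = ClassificationResult(
    hs_code="6109.10.00",
    duty_rate_percent=0.0,  # placeholder, not a real duty rate
    legal_citations=("placeholder citation",),
    restricted=False,
    country_notes="placeholder note",
    audit_trail=(AuditEvent(datetime.now(timezone.utc), "rule matched"),),
)
print(example.is_audit_ready())
```

The point of the sketch is the contrast with a bare LLM answer, which would typically return only the `hs_code` string with nothing to support it in an audit.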
The real cost of DIY
- Engineering team, ~£590k annually. Senior developers, ML, DevOps, external compliance.
- API costs at scale, £220k to £730k per year. Even with caching and prompt work.
- Data acquisition and maintenance, £180k to £300k per year. Licences, manual quarterly updates, monitoring.
- Infrastructure, £60k to £100k per year. High availability, storage, CDN, security certification.
- Timeline, 30 to 42 months to production readiness. Then you are out of date again.
Total cost over 3 years. £4.7 million to £5.8 million. Hurricane over 3 years. ~£876k to £950k for the same volume, deployed in six weeks, with guarantees.
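As a rough cross-check, the sketch below sums only the four annual line items listed above and multiplies by three years. The dictionary keys are illustrative labels. Treating each item as a flat annual cost is an assumption, and any costs not itemised in the list (for example rework or delay) are not included, so the result is a floor rather than a full total.

```python
# Annual DIY line items from the list above, as (low, high) in GBP thousands.
diy_annual_k = {
    "engineering": (590, 590),
    "api": (220, 730),
    "data": (180, 300),
    "infrastructure": (60, 100),
}
YEARS = 3
HURRICANE_ANNUAL_K = 292  # 7,300,000 queries at £0.04

# Sum the low and high ends of every line item separately.
annual_low = sum(low for low, _ in diy_annual_k.values())
annual_high = sum(high for _, high in diy_annual_k.values())

diy_low_3yr = annual_low * YEARS
diy_high_3yr = annual_high * YEARS
hurricane_3yr = HURRICANE_ANNUAL_K * YEARS

print(f"DIY itemised, per year: £{annual_low}k to £{annual_high}k")
print(f"DIY itemised, {YEARS} years: £{diy_low_3yr}k to £{diy_high_3yr}k")
print(f"Hurricane, {YEARS} years: £{hurricane_3yr}k")
```

Replace the ranges with your own estimates to build a comparison for your volumes.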
This is a moat problem, not a model problem
Data access, relationships, and institutional knowledge are the moat. That means direct feeds from 170-plus customs authorities, partnerships that signal changes early, a compliance team that interprets rules before they land, 340 million real-world classifications, and ex-customs expertise that has seen every edge case.
You cannot scrape this, you cannot prompt engineer your way to it, you cannot buy a decade of lived compliance logic.
Stop solving solved problems
You do not build your own payment processor, you use Stripe. You do not build your own email infrastructure, you use SendGrid. You do not build your own certificate authority, you use Let’s Encrypt. You should not build your own customs classification engine.
Two choices
- DIY. Spend 24 to 36 months and £1.5 million to launch something slower, less accurate, non-compliant, and expensive to maintain.
- Hurricane. Get 98.7 percent accuracy, responses in the low hundreds of milliseconds, full audit trails, and guaranteed compliance. Implement in six weeks, then focus your engineers on what differentiates your business.
Note, these numbers illustrate a typical retailer, logistics provider, or customs broker. Build your own ROI with your specific volumes and costs.



