eCommerceNews New Zealand - Technology news for digital commerce decision-makers

Inference-as-a-service is the secret sauce behind a new breed of AI companies

Thu, 26th Feb 2026

The number of AI companies in Australia and New Zealand is booming. The problem is: so are their costs.

Government figures show that there were at least 1533 AI companies in Australia by mid-2025. To put that in context, just six months before, the government put the number of AI companies headquartered in Australia at around 650.

This is not only an astronomical leap in the number of companies making AI products and services, but also in the collective capability being produced, which manifests as productivity benefits for organisations and for the broader economy. 

Of course, SMEs are not only buyers and consumers of AI services; the vast majority of AI companies being set up in Australia and New Zealand are SMEs themselves.

While some may go on to achieve success and even unicorn status, as SMEs they face a common challenge: though they have typically built their businesses by being innovative, they are perennially resource-constrained, and this shapes both their decision-making and what they can do.

For Australia to continue to produce world-leading AI companies, there needs to be a set of conditions and services tailor-made for them, that allow them to host and run AI products and workloads efficiently and cost-effectively.

Access to cost-effective cloud compute is fundamental to the ongoing emergence and viability of AI companies.

AI workloads are resource-intensive, and AI companies increasingly face scrutiny over their level of investment and operational expenses. The costs being incurred by large overseas vendors investing in AI solutions have been a cause of recent stock market jitters.

Many larger AI companies use hyperscale cloud services to power their AI products and services. But for smaller, up-and-coming, SME-sized AI companies, this expense can be too much to even contemplate.

This is driving Australian and New Zealand SMEs with AI product ambitions to investigate alternatives. 

An emerging option is to adopt inference-as-a-service, which SMEs in Australia and New Zealand are already taking advantage of to power their AI products and services cost-effectively.

The two main types

Inference-as-a-service takes two predominant forms. The most common is infrastructure-as-a-service, delivered as some combination of the GPUs, CPUs, RAM or data storage needed to run an AI workload.

This infrastructure can be cost-effectively hosted in a private cloud or even on-premises under a managed services arrangement. The AI company implements, runs and manages its AI application or model on top of this infrastructure, including data ingress.

The second type of inference-as-a-service sees a service provider host the necessary IT infrastructure to power the AI product or workload, as well as the AI application or model itself. This could appear to the AI company as a private large language model (LLM) that it can use securely on demand, or as a machine learning model or algorithm designed to understand, classify and analyse unstructured data to find specific patterns or insights. The AI company is responsible only for bringing data to the inference-as-a-service for processing and analysis.
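In this second model, the AI company's integration can be as thin as an authenticated API call: package the data, send it to the hosted model, read back the result. A minimal sketch of what that might look like, assuming the provider exposes an OpenAI-compatible chat endpoint (the endpoint URL, model name and request schema below are illustrative, not a specific vendor's API):

```python
import json
from urllib import request

# Hypothetical values -- an inference-as-a-service provider would
# supply the real endpoint and model identifier.
ENDPOINT = "https://inference.example.com/v1/chat/completions"
MODEL = "private-llm"


def build_payload(document_text: str) -> dict:
    """Package a piece of unstructured data into a chat-completion
    style request. The schema mirrors the common OpenAI-compatible
    format many hosted LLM services expose; a real API may differ."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                "content": "Classify this document and extract key entities.",
            },
            {"role": "user", "content": document_text},
        ],
    }


def classify(document_text: str, api_key: str) -> dict:
    """Send the document to the hosted model and return its JSON reply."""
    req = request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(document_text)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The point of the sketch is the division of responsibility: everything above the HTTP call is the AI company's concern, and everything behind the endpoint (GPUs, model weights, scaling) is the provider's.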

The advantage in both examples is that it's significantly more cost-effective to host AI infrastructure in a private cloud or on-premises as a managed service, compared to a hyperscale cloud. In one recent project with Atturra, standing up the infrastructure needed to run and host an AI model came in at between one-quarter and one-third of the cost of buying the equivalent capacity from a hyperscaler.

For an AI company trying to scale and build a competitive value proposition today, the cost of GPUs is a significant barrier. An inference-as-a-service provider can therefore be fundamental to making AI-based products and businesses viable.

Inference in action at iTronics and more

New Zealand's iTronics Group is an example of an SME realising its ambition to become an AI company cost-effectively, through an industry-first inference-as-a-service solution.

The company wanted to use AI to help its customers manage security and surveillance camera feeds from multiple sites centrally, analysing the high-resolution video content in real-time to raise alerts on suspicious or unsafe activities. This required cloud, networking and GPU infrastructure.

Atturra and iTronics co-created an inference-as-a-service capability to support iTronics' use case, together with a commercial model that balanced flexibility, scalability, and cost efficiency for both parties. The inference-as-a-service is fundamental to the ongoing viability of iTronics' AI business model.

Another example of an inference-as-a-service is a Nuix service offering from Atturra that can help companies supercharge investigations and overcome common data challenges. Governments and corporations of all sizes are facing more regulatory pressure and tighter deadlines. These organisations are being asked to analyse large amounts of unstructured data in response to information requests and fraud investigations.

AI enrichment, powered by inference-as-a-service, can significantly help these organisations get the answers they need faster and with more accuracy. By using AI to analyse text, chat messages, emails, images, video and other unstructured data, companies get answers faster, which allows them to take meaningful action with more immediate effect, such as isolating a fraud case and collecting evidence that can be used to commence legal proceedings.

In conclusion, inference-as-a-service is a growing core component for delivering AI products and experiences. As more companies use it to become AI companies in their own right, this will mean more innovative AI products and services can be brought to market, where they can benefit all organisations and drive efficiency and productivity gains.