Technical· 11 min de lecture

The architecture of an artificial-intelligence project: three patterns and the discipline that makes them durable

The architecture of an artificial-intelligence project: three patterns and the discipline that makes them durable

Note revised on 25 May 2026. Article originally published in March 2026 — full rewrite.

The architecture of an artificial-intelligence project designates the set of structuring technical choices: which model is used, under which mode of access, how the data is managed, how the components are orchestrated. These choices durably condition the cost, performance and scalability of the result. A bad technical start costs months of delay and tens of thousands of francs; a good start is not seen, because it allows the serene evolution of the project over several years.

This note sets out the three architecture patterns that structure practice in 2026, qualifies the criteria distinguishing their appropriate use, and identifies the discipline that makes these choices durable beyond the initial start.

Three architecture patterns that structure practice

The apparent diversity of artificial-intelligence projects comes down, in practice, to three fundamental architecture patterns. Understanding what distinguishes these three patterns, and knowing which one corresponds to a given use case, constitutes the most structuring technical decision of a project.

The API Wrapper pattern: direct access to models through a programmatic interface

The simplest pattern consists of calling a language model through its programmatic interface, and building the application's business logic around that call. No dedicated infrastructure to manage, no specialised hardware to provision. Operational cost is proportional to actual usage.

This pattern suits rapid prototyping, moderate volumes, and use cases that do not require fine knowledge of company-specific data: summarisation of generic texts, simple classification, generation of standard content. An SME can reach a functional prototype in a few days with this approach.

Its main limitation lies in dependence on a single provider. The model market evolves rapidly, and price-performance trade-offs shift regularly. Building from the outset an abstraction layer that isolates the business logic from the model provider preserves the freedom to change provider without rewriting the application. This discipline — often referred to as the Provider Pattern — is not a luxury; it is operational insurance whose initial cost is marginal and whose value is revealed at the first provider change.

The RAG pattern: anchoring responses in company-specific data

The RAG pattern — for Retrieval-Augmented Generation — has become the standard for enterprise projects requiring knowledge of the organisation's own data. Instead of injecting everything into the prompt sent to the model, a semantic-search engine is combined with a language model.

The operation breaks down into three steps. First, indexing: the company's documents — manuals, internal procedures, knowledge base, product catalogue, legal corpus — are split into segments and converted into vector representations that capture their semantic meaning. Next, search: when a user asks a question, the system identifies the most relevant segments by vector similarity, rather than by simple keyword matching. Finally, generation: the model receives the question together with the relevant segments, and produces a contextualised response that can cite its sources.

This pattern responds to the central challenge of language models used alone: hallucinations. By anchoring each response in verified documents, the system remains factual. When the knowledge base does not contain the answer to a question, the system can say so explicitly rather than invent.

Its quality, however, depends directly on the quality of the indexed data. Documents that are obsolete, poorly structured, contradictory or incomplete produce mediocre responses, whatever model is used. The data-preparation work — often neglected in the commercial discourse around the tools — represents, in the practice observed at the firm on MCVA projects, a substantial share of the total effort — typically between one half and two thirds on serious projects. This observation deserves to be posed at framing, because it substantially modifies the effort estimate.

The agents pattern: delegating execution of multi-step tasks

The agents pattern represents the next level of complexity. An agent is a model-driven system capable of using tools — web search, calculations, API calls, database queries — and of planning a sequence of actions to accomplish a complex task.

This pattern suits tasks requiring several coordinated steps: information research, intermediate analysis, decision on what to do next, execution of an action in a business system. The end-to-end automation of complex workflows — checking an order status in an existing management system followed by an update in another, for example — constitutes the typical use case.

The operational advantage is substantial when the pattern is appropriate: end-to-end automation of business processes that classically required several human interventions. Its limitation lies in implementation complexity — debugging a multi-step agent is demanding — and in high operational cost: an agent can make several dozen model calls to handle a single complex task. This pattern should be reserved for high-value use cases where the complexity justifies the operating and maintenance cost.

The structuring discipline: data quality prevails over model choice

The most frequent error observable on artificial-intelligence projects consists of concentrating attention on the choice of model, while the quality of the data disproportionately determines the quality of the final result.

Three observations support this qualification.

First, on projects relying on company-specific data — the majority of serious projects — the quality difference between the leading available models is relatively modest compared with the difference produced by the quality of the indexed data. A RAG system fed with clean and structured documents on a model of average capacity regularly produces better results than a RAG system fed with disorganised documents on a leading model.

Next, investment in data quality has a cumulative and durable character. Data cleaned and structured once serves all subsequent uses. The choice of a model, by contrast, must be reconsidered regularly because the market evolves. Investing effort in data rather than in model selection therefore produces more durable value.

Finally, observable practice shows that the projects that fail share a common characteristic: they underestimated the data-preparation effort, and they discovered it too late. The projects that succeed invest from the initial framing in this preparation, and they accompany deployment with data-maintenance processes that preserve quality over time.

Five recurring errors that practice observes

Beyond the choice of pattern and attention to data, five recurring errors deserve to be identified because they produce substantial opportunity costs.

Starting with the supplementary training of a model. Supplementary training — often referred to as fine-tuning — is rarely necessary first off. In the great majority of cases, a RAG system with well-built prompts produces equivalent results for a fraction of the cost and effort. Supplementary training is relevant in specific situations where the model must adopt a very particular style or master a technical vocabulary the RAG does not cover correctly. Before investing in this path, it is generally productive to push further the optimisation of the RAG.

Ignoring data quality. The point was developed above, but it deserves to be recalled because it probably constitutes the most structuring error.

Neglecting systematic evaluation. Without quality metrics — relevance, fidelity to sources, completeness — it is impossible to measure progress objectively. Putting in place an evaluation framework from project start, with a reference set of a few dozen human-validated question-answer pairs, constitutes the operational investment with the highest return.

Underestimating operational costs in production. A prototype consuming a moderate budget in the exploratory phase can reach a substantially larger budget once moved into actual production. The forecast calculation based on the expected number of users, the frequency of use and the average request size deserves to be conducted at framing, rather than discovered in operation.

Building a monolithic system. A system that mixes in a single inseparable block the search engine, the language model, the cache, the user interface and the evaluation layer is difficult to evolve and debug. A modular architecture, where each component is independent, allows replacing one element without touching the others. This modularity does not require substantial effort at start, but it preserves the system's scalability over time.

For a Swiss SME starting an artificial-intelligence project, a progressive approach distinguishes the projects that succeed from those that get stuck.

The exploratory phase first, conducted with an API Wrapper pattern on a precise and measurable use case. This phase, which typically takes a few weeks, validates that the concept holds up in the company's specific context. Its modesty is its strength: a contained investment, a limited risk, immediate learnings.

The enriched prototype next, generally with a RAG pattern on the company's own data, accompanied by systematic evaluation of quality and first user tests. This phase, which runs over a few months, adjusts the system to real conditions of use and identifies the adjustments necessary before deployment.

Deployment with monitoring completes the sequence. This phase deploys the system in its target use context, with user-feedback processes, continuous prompt improvement and data maintenance that preserve quality over time.

The evaluation of a possible supplementary training of the model, or of the addition of agents to automate complex workflows, intervenes only after this deployment, on the basis of real observations rather than hypotheses. This discipline of waiting distinguishes the projects that consolidate operational value from those that pile up layers of complexity without tangible benefit.

Sovereignty over processed data

For projects touching data subject to the Federal Act on Data Protection[1] or to specific sector obligations, three approaches structure practice, in order of increasing constraint.

Contractual non-retention commitments first, negotiable with the leading model providers for their enterprise plans. This approach suits the majority of cases where the data is sensitive without being strategically critical.

Hosting on European or Swiss infrastructure next, which guarantees that the data does not leave the jurisdiction concerned. This approach suits contexts where localisation is an explicit requirement, either regulatory or contractual.

Deployment on controlled infrastructure completes the list, generally with open models. This approach guarantees that the data remains entirely on the company's infrastructure, at the cost of internal technical competence and hardware investments that can be substantial.

For companies subject to strict sector regulations — financial sector, health, public administration — a formal risk analysis conducted at the initial framing of the project avoids costly trade-offs midway.

The discipline that distinguishes seriousness

The choice of an architecture for an artificial-intelligence project is not a matter of sharp expertise inaccessible to non-technical decision-makers. It is a matter of methodical discipline: pose the appropriate pattern for the use case, invest in data quality before model choice, build in from the outset the conditions for system scalability, treat the question of sovereignty explicitly rather than elude it.

This discipline distinguishes, once again, the projects that produce durable value from those that consume budget without effectively transforming the functioning of the organisation. It does not require any particular technical genius. It requires sustained attention to the professional rigour of the initial trade-offs.

On the RAG pattern in particular, and on the conversational architectures that use it as a foundation, see also the note Chatbots driven by generative models: from scripted FAQ to agents that act.

Sources

[1] Federal Act on Data Protection (FADP), revision of 25 September 2020, in force since 1 September 2023. www.fedlex.admin.ch/eli/cc/2022/491/en []


Jérôme Deshaie is CEO of MCVA Consulting SA, a Swiss firm specialising in strategic consulting on artificial intelligence, based in Valais.

Related articles