Term Base

Pauhu's term base keeps terminology consistent across all content: domain-specific terms are translated the same way in every project and language, and stay stable over time.


What is a Term Base?

A term base (terminology database) stores approved translations for specific terms:

Source Term (EN)          Target Term (FI)      Domain        Reliability
artificial intelligence   tekoäly               36 Science    ⭐⭐⭐⭐
GDPR                      tietosuoja-asetus     12 Law        ⭐⭐⭐⭐
encryption                salaus                32 IT         ⭐⭐⭐⭐
quantum-safe              kvanttiturvallinen    36 Science    ⭐⭐⭐
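
As an illustration, entries like the ones above could be modelled in plain Python as follows. This is a minimal sketch: the `TermEntry` class, the `INDEX` dictionary, and the `lookup` helper are hypothetical names chosen for this example and are not part of the Pauhu SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermEntry:
    """One approved translation (hypothetical structure, not the Pauhu schema)."""
    source: str       # source term (EN)
    target: str       # approved target term (FI)
    domain: str       # domain label as shown in the table
    reliability: int  # number of stars


TERM_BASE = [
    TermEntry("artificial intelligence", "tekoäly", "36 Science", 4),
    TermEntry("GDPR", "tietosuoja-asetus", "12 Law", 4),
    TermEntry("encryption", "salaus", "32 IT", 4),
    TermEntry("quantum-safe", "kvanttiturvallinen", "36 Science", 3),
]

# Index by lower-cased source term so lookups are case-insensitive.
INDEX = {entry.source.lower(): entry for entry in TERM_BASE}


def lookup(term: str) -> Optional[TermEntry]:
    """Return the approved entry for a source term, or None if unknown."""
    return INDEX.get(term.strip().lower())


entry = lookup("Encryption")
if entry is not None:
    stars = "⭐" * entry.reliability
    print(f"{entry.source} → {entry.target} ({entry.domain}, {stars})")
```

Keying the index on the lower-cased source term keeps lookups cheap; a real term base would also need to handle inflected forms and multiple entries per term.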

Automatic Term Recognition

from pauhu import Pauhu

client = Pauhu()

# Translate with term base enforcement
result = client.translate(
    text="Our AI uses GDPR-compliant encryption.",
    source="en",
    target="fi",
    domain="12 Law",
    enforce_terms=True  # Enforce term base
)

print(result.text)
# "Tekoälymme käyttää tietosuoja-asetuksen mukaista salausta."

# Check which terms were recognized
print(result.terms_applied)
# [
#   {"source": "AI", "target": "tekoäly", "confidence": 0.98},
#   {"source": "GDPR", "target": "tietosuoja-asetus", "confidence": 1.0},
#   {"source": "encryption", "target": "salaus", "confidence": 0.95}
# ]
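
How Pauhu recognizes terms internally is not described here. The following is a rough sketch of one common approach, a longest-match, whole-word scan over the source text. The `TERMS` dictionary and `recognize_terms` function are assumptions made for illustration, not SDK calls.

```python
import re

# Hypothetical in-memory term base: source term -> approved Finnish term.
TERMS = {
    "artificial intelligence": "tekoäly",
    "AI": "tekoäly",
    "GDPR": "tietosuoja-asetus",
    "encryption": "salaus",
}


def recognize_terms(text, terms):
    """Return (matched_text, target, start_offset) for each term found.

    Longer terms are tried first so that a multi-word entry wins over a
    shorter entry it contains. Matching is case-insensitive and limited
    to whole words.
    """
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    by_lower = {t.lower(): target for t, target in terms.items()}
    return [
        (m.group(1), by_lower[m.group(1).lower()], m.start())
        for m in pattern.finditer(text)
    ]


for source, target, offset in recognize_terms(
    "Our AI uses GDPR-compliant encryption.", TERMS
):
    print(f"{offset:>3}  {source} → {target}")
```

The whole-word guard matters for short entries such as "AI", which would otherwise match inside unrelated words.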

Term Base Sources

1. IATE (Interactive Terminology for Europe)

8.4 million terms from EU institutions:

# IATE terms are automatically available
result = client.translate(
    text="The European Commission adopted a directive.",
    source="en",
    target="fi",
    domain="10 European Union"
)

# IATE terms automatically applied:
# "European Commission" → "Euroopan komissio" (⭐⭐⭐⭐)
# "directive" → "direktiivi" (⭐⭐⭐⭐)

2. EuroVoc (EU Multilingual Thesaurus)

7,000+ domain concepts across 21 EuroVoc domains:

Domain               Terms     Languages
04 Politics          850+      24
10 European Union    1,200+    24
12 Law               1,500+    24
36 Science           900+      24

3. Custom Term Bases

Upload your own terminology:

# Import organization-specific terms
project = client.projects.create(name="Legal Docs Q1 2025")

project.terms.import_csv(
    file_path="./organization-terms.csv",
    format="csv",
    columns=["source", "target", "domain", "reliability"]
)

CSV Format:

source,target,domain,reliability
data controller,rekisterinpitäjä,12 Law,4
data processor,henkilötietojen käsittelijä,12 Law,4
lawful basis,oikeusperuste,12 Law,4
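
Before uploading, it can help to validate the file locally. The sketch below uses only the standard library. The `load_terms` helper is hypothetical, and the 1-4 reliability range is an assumption based on the star ratings shown on this page.

```python
import csv
import io

REQUIRED_COLUMNS = ["source", "target", "domain", "reliability"]


def load_terms(csv_text):
    """Parse term rows; reject missing columns, empty terms, or bad ratings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")

    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not (row["source"] or "").strip() or not (row["target"] or "").strip():
            raise ValueError(f"line {line_no}: empty source or target")
        reliability = int(row["reliability"] or "0")
        if not 1 <= reliability <= 4:  # assumed range, see lead-in
            raise ValueError(f"line {line_no}: reliability must be 1-4")
        rows.append({**row, "reliability": reliability})
    return rows


SAMPLE = """source,target,domain,reliability
data controller,rekisterinpitäjä,12 Law,4
data processor,henkilötietojen käsittelijä,12 Law,4
lawful basis,oikeusperuste,12 Law,4
"""

for term in load_terms(SAMPLE):
    print(term["source"], "→", term["target"])
```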


Term Base Priority

When multiple term bases contain the same term:

Priority (highest to lowest):
1. Custom Project Term Base    (your organization)
2. IATE (EU official)          (⭐⭐⭐⭐ reliability)
3. EuroVoc                     (EU multilingual thesaurus)
4. Domain-specific glossaries  (per-domain terms)
5. AI-generated suggestions    (see AI Term Base)
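
The priority order above amounts to a first-match lookup over an ordered list of sources. A minimal sketch, assuming each term base can be treated as a dictionary: the source names, the `resolve` function, and the placeholder entries are invented for this example.

```python
# Source names in priority order, highest first (hypothetical identifiers).
PRIORITY = ["custom", "iate", "eurovoc", "glossary", "ai_suggestion"]


def resolve(term, term_bases):
    """Return (target, source_name) from the highest-priority base, or None."""
    key = term.lower()
    for name in PRIORITY:
        target = term_bases.get(name, {}).get(key)
        if target is not None:
            return target, name
    return None


# Illustrative data only; the "placeholder" strings are not real entries.
TERM_BASES = {
    "custom": {"data controller": "rekisterinpitäjä"},
    "iate": {
        "data controller": "placeholder IATE entry",
        "directive": "direktiivi",
    },
    "ai_suggestion": {"directive": "placeholder AI suggestion"},
}

print(resolve("data controller", TERM_BASES))  # custom entry wins over IATE
print(resolve("directive", TERM_BASES))        # IATE wins over AI suggestion
print(resolve("unknown term", TERM_BASES))     # no base has the term
```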

Multilingual Term Bases

Create multilingual term bases for EU projects:

# Define term in all 24 EU languages
project.terms.add(
    source_term="artificial intelligence",
    translations={
        "fi": "tekoäly",
        "sv": "artificiell intelligens",
        "de": "künstliche Intelligenz",
        "fr": "intelligence artificielle",
        "es": "inteligencia artificial",
        # ... 18 more languages
    },
    domain="36 Science",
    reliability=4
)
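
When an entry is meant to cover every EU language, a quick completeness check can catch gaps before the term is added. The sketch below is independent of the Pauhu SDK, and `missing_languages` is a hypothetical helper. The set lists the ISO 639-1 codes of the 24 official EU languages.

```python
# ISO 639-1 codes of the 24 official EU languages.
EU_LANGUAGES = frozenset({
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
})


def missing_languages(source_lang, translations):
    """Return the sorted EU language codes that still lack a translation."""
    required = EU_LANGUAGES - {source_lang}
    return sorted(required - set(translations))


translations = {
    "fi": "tekoäly",
    "sv": "artificiell intelligens",
    "de": "künstliche Intelligenz",
    "fr": "intelligence artificielle",
    "es": "inteligencia artificial",
}

gaps = missing_languages("en", translations)
print(f"{len(gaps)} languages missing: {', '.join(gaps)}")
```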

Term Base Integration with AI

Terms flow in both directions between the term base, AI translation, and translation memory:

graph LR
    A[Term Base] -->|Enforce consistency| B[AI Translation]
    B -->|Extract new terms| C[AI Term Base]
    C -->|Suggest additions| A
    A -->|Context| D[Translation Memory]
    D -->|Historical usage| A

See AI Term Base for AI-powered terminology extraction.


Quality Enforcement

# Strict term base enforcement
result = client.translate(
    text="AI processes personal data under GDPR.",
    target="fi",
    domain="12 Law",
    enforce_terms="strict"  # Fail if terms not found
)

# Permissive (suggest but don't enforce)
result = client.translate(
    text="Emerging AI concepts like AGI.",
    target="fi",
    domain="36 Science",
    enforce_terms="suggest"  # Suggest matches, allow alternatives
)
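
The difference between the two modes can be pictured with a small post-translation check. This is a sketch of one possible reading of "strict" and "suggest", not a description of Pauhu's actual behavior. `check_terms` and `TermViolation` are hypothetical names, and the substring test is deliberately naive.

```python
class TermViolation(Exception):
    """Raised in strict mode when an approved term is missing from the output."""


def check_terms(source_text, translated_text, terms, mode="strict"):
    """Compare a translation against approved terms.

    terms maps source terms to approved target terms. In "strict" mode the
    first missing target raises TermViolation; in "suggest" mode all misses
    are returned as (source, expected_target) pairs.
    """
    if mode not in ("strict", "suggest"):
        raise ValueError(f"unknown mode: {mode!r}")

    src = source_text.lower()
    out = translated_text.lower()
    misses = [
        (s, t) for s, t in terms.items()
        if s.lower() in src and t.lower() not in out
    ]
    if misses and mode == "strict":
        s, t = misses[0]
        raise TermViolation(f"{s!r} should be translated as {t!r}")
    return misses


APPROVED = {"encryption": "salaus"}
print(check_terms("The file uses encryption.", "Tiedosto käyttää salausta.", APPROVED))
print(check_terms("The file uses encryption.", "Tiedosto käyttää kryptausta.",
                  APPROVED, mode="suggest"))
```

Plain substring matching breaks down for inflected forms whose stem changes, for example "tietosuoja-asetus" versus "tietosuoja-asetuksen", so a production check would need lemmatization or a morphological analyzer.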

Compliance

ISO 17100:2015

"Terminology resources shall be used and maintained throughout the translation process."

Pauhu's term base satisfies ISO 17100 requirements for terminology management.

GDPR Article 32

Standard terminology ensures:
- Consistent privacy notices across languages
- Legally accurate translations of data protection terms
- No ambiguity in user rights


Export and Backup

# Export term base (TBX format)
project.terms.export(
    file_path="./term-base-backup.tbx",
    format="tbx"  # TermBase eXchange (ISO 30042)
)

# Also supports:
# - CSV (simple tabular)
# - XLIFF (translation interchange)
# - TMX (translation memory exchange)
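
For orientation, the sketch below builds a simplified TBX-style document with the standard library. It uses the element names of the 2008 edition of ISO 30042 (`martif`, `termEntry`, `langSet`, `tig`); the 2019 edition renames several of these. It omits the header and is not validated against any TBX schema, so treat it as an illustration of the nesting rather than as the file Pauhu produces. `build_tbx` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tbx(entries, source_lang="en", target_lang="fi"):
    """Build a simplified TBX-style tree from (source, target) pairs."""
    root = ET.Element("martif", {"type": "TBX", XML_LANG: source_lang})
    body = ET.SubElement(ET.SubElement(root, "text"), "body")
    for number, (source, target) in enumerate(entries, start=1):
        entry = ET.SubElement(body, "termEntry", {"id": f"c{number}"})
        # One langSet per language, each holding a single term.
        for lang, term in ((source_lang, source), (target_lang, target)):
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = term
    return root


root = build_tbx([("encryption", "salaus"), ("directive", "direktiivi")])
if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```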

Getting Started

from pauhu import Pauhu

client = Pauhu()

# Create project with term base
project = client.projects.create(
    name="EU Regulation Translation",
    domain="10 European Union"
)

# IATE terms automatically available
result = project.translate(
    text="The regulation enters into force.",
    source="en",
    target="fi"
)

# Check applied terms
for term in result.terms_applied:
    print(f"{term.source} → {term.target} (⭐×{term.reliability})")

Further Reading