LEMMA-IS(1)

Icelandic Lemmatization API

NAME

lemma-is - Icelandic word lemmatization service

SYNOPSIS

curl -X POST https://lemma.solberg.is/api/lemmatize \
  -H "Content-Type: application/json" \
  -d '{"word":"börnin"}'

DESCRIPTION

A free public API for Icelandic lemmatization. Maps inflected Icelandic word forms to their base forms (lemmas). Useful for search indexing, text analysis, and NLP tasks.

Powered by the lemma-is npm library.

ENDPOINTS

Method  Path                  Description
GET     /                     This documentation
GET     /health               Health check
POST    /api/lemmatize        Single word → lemmas
POST    /api/lemmatize/morph  Word → lemmas with morphology
POST    /api/text             Text → unique lemmas
POST    /api/batch            Multiple words (up to 1000)

EXAMPLES

Single word:

curl -X POST https://lemma.solberg.is/api/lemmatize \
  -H "Content-Type: application/json" \
  -d '{"word":"börnin"}'

# Response: {"word":"börnin","lemmas":["barn"]}
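
The same request can be issued from a script. What follows is a minimal Python sketch using only the standard library. The helper names build_request and post_json are illustrative and are not part of the service or of the lemma-is library, and handling of non-200 responses is left out.

```python
import json
import urllib.request

BASE_URL = "https://lemma.solberg.is"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one endpoint."""
    # ensure_ascii=False keeps Icelandic letters as UTF-8, like the curl example.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling post_json("/api/lemmatize", {"word": "börnin"}) should return the response shown above as a Python dict, provided the service behaves as documented here.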

With morphological info:

curl -X POST https://lemma.solberg.is/api/lemmatize/morph \
  -H "Content-Type: application/json" \
  -d '{"word":"hestinum"}'

# Response: {"word":"hestinum","lemmas":[{"lemma":"hestur","category":"no","gender":"kk","case":"þgf","number":"et"}]}
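
A client will usually want typed access to the fields of each reading. The sketch below parses the response shown above into a small dataclass. The Analysis class and parse_morph function are hypothetical names, and treating every field except "lemma" as optional is an assumption, since only a noun example is documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Analysis:
    """One morphological reading of a word form."""
    lemma: str
    category: Optional[str] = None
    gender: Optional[str] = None
    case: Optional[str] = None
    number: Optional[str] = None


def parse_morph(response: dict) -> list[Analysis]:
    """Convert the "lemmas" array of a morph response into Analysis objects."""
    return [
        Analysis(
            lemma=entry["lemma"],
            # Assumption: these keys may be missing for other word classes.
            category=entry.get("category"),
            gender=entry.get("gender"),
            case=entry.get("case"),
            number=entry.get("number"),
        )
        for entry in response.get("lemmas", [])
    ]


# The documented response for "hestinum".
sample = {
    "word": "hestinum",
    "lemmas": [
        {"lemma": "hestur", "category": "no", "gender": "kk",
         "case": "þgf", "number": "et"}
    ],
}
analyses = parse_morph(sample)
```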

Text indexing:

curl -X POST https://lemma.solberg.is/api/text \
  -H "Content-Type: application/json" \
  -d '{"text":"Börnin fóru í skólann"}'

# Response: {"original":"Börnin fóru í skólann","lemmas":["barn","fara","fóra","skóli","í"]}
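
One way to use this output for search is an inverted index keyed by lemma, so that a query for any inflected form finds the same documents once the query is lemmatized too. The sketch below is only an illustration: build_index is a hypothetical helper, and the second document's lemma list is made up for the example.

```python
from collections import defaultdict


def build_index(docs: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each lemma to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, lemmas in docs.items():
        for lemma in lemmas:
            index[lemma].add(doc_id)
    return dict(index)


docs = {
    # Lemmas from the documented response above.
    "doc1": ["barn", "fara", "fóra", "skóli", "í"],
    # Hypothetical second document, invented for this example.
    "doc2": ["barn", "hestur"],
}
index = build_index(docs)
```

All candidate lemmas are indexed, including alternatives such as "fóra", which favors recall over precision.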

Batch processing:

curl -X POST https://lemma.solberg.is/api/batch \
  -H "Content-Type: application/json" \
  -d '{"words":["hestur","hestinn","hestinum"]}'

# Response: {"results":[{"word":"hestur","lemmas":["hestur"]},{"word":"hestinn","lemmas":["hestur"]},{"word":"hestinum","lemmas":["hestur"]}]}
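
Word lists longer than the batch limit of 1,000 words have to be split across several requests. The following sketch shows the splitting and merging steps only. chunk_words and merge_results are hypothetical helper names, and sending each chunk to /api/batch is left to the caller.

```python
from typing import Iterator

MAX_BATCH = 1000  # batch size limit stated in this document


def chunk_words(words: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` words."""
    for start in range(0, len(words), size):
        yield words[start:start + size]


def merge_results(responses: list[dict]) -> dict[str, list[str]]:
    """Combine several batch responses into one word -> lemmas mapping."""
    merged: dict[str, list[str]] = {}
    for response in responses:
        for item in response["results"]:
            merged[item["word"]] = item["lemmas"]
    return merged
```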

MORPHOLOGICAL CODES

Code  Meaning     English
Case (fall)
nf    nefnifall   nominative
þf    þolfall     accusative
þgf   þágufall    dative
ef    eignarfall  genitive
Gender (kyn)
kk    karlkyn     masculine
kvk   kvenkyn     feminine
hk    hvorugkyn   neuter
Number (tala)
et    eintala     singular
ft    fleirtala   plural
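
The codes above can be turned into readable labels with plain lookup tables. This is a small sketch, with describe as a hypothetical helper name. Codes that are not in the table are passed through unchanged, which is a design choice of this example and not documented behavior of the service.

```python
CASES = {"nf": "nominative", "þf": "accusative", "þgf": "dative", "ef": "genitive"}
GENDERS = {"kk": "masculine", "kvk": "feminine", "hk": "neuter"}
NUMBERS = {"et": "singular", "ft": "plural"}

# Field name in a morph entry -> lookup table for that field.
FIELDS = (("gender", GENDERS), ("case", CASES), ("number", NUMBERS))


def describe(entry: dict) -> str:
    """Return an English description such as "masculine dative singular"."""
    parts = []
    for field, table in FIELDS:
        code = entry.get(field)
        if code:
            # Unknown codes are kept as-is so that no information is lost.
            parts.append(table.get(code, code))
    return " ".join(parts)


entry = {"lemma": "hestur", "category": "no", "gender": "kk",
         "case": "þgf", "number": "et"}
print(describe(entry))  # masculine dative singular
```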

LIMITS

Limit        Value
Word length  100 characters
Text length  50,000 characters
Batch size   1,000 words
Rate limit   100 requests/minute per IP

API key: to bypass rate limiting, include an X-API-Key: <key> header in each request.
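
Checking these limits on the client avoids sending requests that would be rejected. The sketch below makes two assumptions that this document does not confirm: that "characters" corresponds to Python's len() of a string, and that the word-length limit also applies to each word in a batch. All function names are hypothetical.

```python
from typing import Optional

MAX_WORD_CHARS = 100
MAX_TEXT_CHARS = 50_000
MAX_BATCH_WORDS = 1_000


def check_word(word: str) -> None:
    """Raise ValueError if a single word exceeds the documented length limit."""
    if len(word) > MAX_WORD_CHARS:
        raise ValueError(f"word longer than {MAX_WORD_CHARS} characters")


def check_text(text: str) -> None:
    """Raise ValueError if a text exceeds the documented length limit."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text longer than {MAX_TEXT_CHARS} characters")


def check_batch(words: list[str]) -> None:
    """Raise ValueError if a batch is too large or contains an overlong word."""
    if len(words) > MAX_BATCH_WORDS:
        raise ValueError(f"batch larger than {MAX_BATCH_WORDS} words")
    for word in words:
        check_word(word)  # assumption: per-word limit applies inside batches


def request_headers(api_key: Optional[str] = None) -> dict[str, str]:
    """Build request headers, adding X-API-Key only when a key is supplied."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return headers
```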

NOTES

Disambiguation: The /api/text endpoint uses context-aware disambiguation. For example, “á” is resolved to the verb “eiga” (to own) when preceded by a subject noun, or kept as a preposition when followed by a dative noun. This uses grammar rules, bigram frequencies, and suffix-based case inference.

Stopwords: The API does not filter stopwords; all lemmas are returned. The underlying lemma-is npm library supports contextual stopword removal (e.g., “á” as a preposition is a stopword, while “á” as the noun “river” is kept). For search indexing, use the library directly with removeStopwords: true and useContextualStopwords: true.
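
For clients that only use the HTTP API, a static filter over the returned lemmas is the simplest substitute. The sketch below is a naive, context-free filter and not the library's contextual method. The stopword set is a tiny illustrative sample chosen for this example, not the list used by lemma-is.

```python
# Illustrative sample only; not the stopword list used by lemma-is.
NAIVE_STOPWORDS = frozenset({"á", "í", "og", "að"})


def drop_stopwords(lemmas: list[str],
                   stopwords: frozenset = NAIVE_STOPWORDS) -> list[str]:
    """Remove lemmas found in a fixed stopword set, preserving order."""
    return [lemma for lemma in lemmas if lemma not in stopwords]


# Lemmas from the documented /api/text response.
kept = drop_stopwords(["barn", "fara", "fóra", "skóli", "í"])
print(kept)  # ['barn', 'fara', 'fóra', 'skóli']
```

Because this filter ignores context, it would also drop “á” where it means “river”, which is the case that contextual stopword removal is meant to handle.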

SEE ALSO

github.com/jokull/lemma-is - npm library

BÍN - Beygingarlýsing íslensks nútímamáls (Database of Modern Icelandic Inflection)

AUTHOR

Jökull Sólberg <jokull@solberg.is>