Icelandic Lemmatization API
lemma-is - Icelandic word lemmatization service
A free public API for Icelandic lemmatization. It maps inflected Icelandic word forms to their base forms (lemmas), which is useful for search indexing, text analysis, and other NLP tasks.
Powered by the lemma-is npm library.
| Method | Path | Description |
|---|---|---|
| GET | / | This documentation |
| GET | /health | Health check |
| POST | /api/lemmatize | Single word → lemmas |
| POST | /api/lemmatize/morph | Word → lemmas with morphology |
| POST | /api/text | Text → unique lemmas |
| POST | /api/batch | Multiple words (up to 1000) |
Single word:
curl -X POST https://lemma.solberg.is/api/lemmatize \
-H "Content-Type: application/json" \
-d '{"word":"börnin"}'
# Response: {"word":"börnin","lemmas":["barn"]}
With morphological info:
curl -X POST https://lemma.solberg.is/api/lemmatize/morph \
-H "Content-Type: application/json" \
-d '{"word":"hestinum"}'
# Response: {"word":"hestinum","lemmas":[{"lemma":"hestur","category":"no","gender":"kk","case":"þgf","number":"et"}]}
Text indexing:
curl -X POST https://lemma.solberg.is/api/text \
-H "Content-Type: application/json" \
-d '{"text":"Börnin fóru í skólann"}'
# Response: {"original":"Börnin fóru í skólann","lemmas":["barn","fara","fóra","skóli","í"]}
Batch processing:
curl -X POST https://lemma.solberg.is/api/batch \
-H "Content-Type: application/json" \
-d '{"words":["hestur","hestinn","hestinum"]}'
# Response: {"results":[{"word":"hestur","lemmas":["hestur"]},{"word":"hestinn","lemmas":["hestur"]},{"word":"hestinum","lemmas":["hestur"]}]}
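The curl calls above can be reproduced from a script. The following is a minimal sketch using only the Python standard library; the helper names (build_request, post_json, lemmas_from_batch) are illustrative and not part of the API, and the response shape is assumed to match the batch example shown above.

```python
import json
import urllib.request

BASE_URL = "https://lemma.solberg.is"  # host taken from the curl examples


def build_request(path, payload):
    """Build a JSON POST request for one of the documented endpoints."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, timeout=10.0):
    """Send the request and decode the JSON response (performs network I/O)."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def lemmas_from_batch(response):
    """Turn a /api/batch response into a {word: [lemmas]} mapping."""
    return {item["word"]: item["lemmas"] for item in response["results"]}
```

A call such as post_json("/api/batch", {"words": ["hestur", "hestinn"]}) would then be passed through lemmas_from_batch to get one lemma list per input word.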
| Code | Meaning | English |
|---|---|---|
| Case (fall) | ||
| nf | nefnifall | nominative |
| þf | þolfall | accusative |
| þgf | þágufall | dative |
| ef | eignarfall | genitive |
| Gender (kyn) | ||
| kk | karlkyn | masculine |
| kvk | kvenkyn | feminine |
| hk | hvorugkyn | neuter |
| Number (tala) | ||
| et | eintala | singular |
| ft | fleirtala | plural |
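The code table above can be applied directly to a /api/lemmatize/morph result. This sketch expands the case, gender, and number codes into their English names; the function name describe is hypothetical, and the category field is left out because its codes are not listed in the table.

```python
CASES = {"nf": "nominative", "þf": "accusative", "þgf": "dative", "ef": "genitive"}
GENDERS = {"kk": "masculine", "kvk": "feminine", "hk": "neuter"}
NUMBERS = {"et": "singular", "ft": "plural"}


def describe(analysis):
    """Expand the coded fields of one morphological analysis into English.

    Missing fields are skipped, and codes not listed in the table are
    passed through unchanged.
    """
    out = {"lemma": analysis["lemma"]}
    for field, table in (("case", CASES), ("gender", GENDERS), ("number", NUMBERS)):
        code = analysis.get(field)
        if code is not None:
            out[field] = table.get(code, code)
    return out


# The analysis of "hestinum" from the morphology example
example = {"lemma": "hestur", "category": "no", "gender": "kk",
           "case": "þgf", "number": "et"}
print(describe(example))
# {'lemma': 'hestur', 'case': 'dative', 'gender': 'masculine', 'number': 'singular'}
```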
| Limit | Value |
|---|---|
| Word length | 100 characters |
| Text length | 50,000 characters |
| Batch size | 1,000 words |
| Rate limit | 100 requests/minute per IP |
API key: To bypass rate limiting, include the header X-API-Key: <key> in each request.
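A client can stay within these limits before sending anything. The sketch below splits a word list into /api/batch payloads and builds the request headers; the helper names are illustrative, and it assumes the length limit counts Unicode code points, which the documentation does not specify.

```python
MAX_WORD_LENGTH = 100          # characters, from the limits table
MAX_BATCH_SIZE = 1000          # words per /api/batch request
MIN_INTERVAL_SECONDS = 60 / 100  # spacing that stays under 100 requests/minute


def batch_payloads(words, batch_size=MAX_BATCH_SIZE):
    """Yield /api/batch payloads, skipping empty and over-long words."""
    usable = [w for w in words if 0 < len(w) <= MAX_WORD_LENGTH]
    for start in range(0, len(usable), batch_size):
        yield {"words": usable[start:start + batch_size]}


def request_headers(api_key=None):
    """Return JSON headers, adding X-API-Key only when a key is supplied."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return headers
```

Without an API key, sleeping for MIN_INTERVAL_SECONDS between requests keeps a single client under the per-IP rate limit.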
Disambiguation: The /api/text endpoint uses context-aware
disambiguation. For example, “á” is resolved to the verb
“eiga” (to own) when preceded by a subject noun, or kept as a preposition
when followed by a dative noun. This uses grammar rules, bigram frequencies,
and suffix-based case inference.
Stopwords: The API does not filter stopwords;
all lemmas are returned. The underlying lemma-is npm library supports
contextual stopword removal (e.g., “á” as a preposition is a stopword,
while “á” as the noun “river” is kept). For search indexing, use the
library directly with removeStopwords: true and
useContextualStopwords: true.
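When only the HTTP API is available, stopwords can be dropped on the client after the call. The sketch below is a plain set-based filter, not the contextual removal the library offers: the stopword set is a small illustrative sample rather than the list shipped with lemma-is, and it would also discard “á” where it means “river”.

```python
# Illustrative stopword lemmas only; NOT the list shipped with lemma-is.
STOPWORDS = frozenset({"í", "á", "og", "að"})


def index_terms(lemmas, stopwords=STOPWORDS):
    """Drop stopword lemmas from an /api/text result, preserving order."""
    return [lemma for lemma in lemmas if lemma not in stopwords]


# The /api/text response from the text indexing example
response = {"original": "Börnin fóru í skólann",
            "lemmas": ["barn", "fara", "fóra", "skóli", "í"]}
print(index_terms(response["lemmas"]))
# ['barn', 'fara', 'fóra', 'skóli']
```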
github.com/jokull/lemma-is - npm library
BÍN - Beygingarlýsing íslensks nútímamáls
Jökull Sólberg <jokull@solberg.is>