These are various lists of words extracted from Wiktionary data dumps. Some of the code
used to produce them is available here.
All these lists undoubtedly contain errors, because Wiktionary itself contains errors.
You can do whatever you like with them, subject to
Wiktionary's licensing, where applicable.
The format is
WORD  PART_OF_SPEECH DEFINITION
on each line (note the two spaces between word and part of speech).
PART_OF_SPEECH is one of the following:
%adjective (e.g. unbelievable)
%noun (e.g. belief)
%noun.proper (e.g. France)
%verb (e.g. believe)
%adverb (e.g. unbelievably)
%interjection (e.g. yowza)
%particle (e.g. O)
%conjunction (e.g. unless)
%preposition (e.g. into)
%determiner (e.g. the)
%pronoun (e.g. yourself)
%contraction (e.g. woulda)
%number (e.g. 2, twenty-seven)
%phrase (e.g. you'd better believe it)
%phrase.prepositional (e.g. beyond belief)
%phrase.proverb (e.g. seeing is believing)
%affix (e.g. 🅱, a “simulfix”)
%affix.prefix (e.g. un-)
%affix.suffix (e.g. -ism)
%affix.infix (e.g. -fuckin-)
%affix.circumfix (e.g. a- -ing)
%affix.interfix (rare, e.g. -retin-)
%symbol (e.g. ℞)
%symbol.punctuation (e.g. …)
%symbol.letter (e.g. b)
%symbol.diacritic (e.g. ◌́)
%unknown (couldn’t be determined/none of the above)

DEFINITION is in the wikitext format.¹
Derived from the enwiktionary-20250701 dump.
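
The line format described above can be read with a few lines of Python. This is only a minimal sketch, not the code used to produce the lists. It assumes that the part-of-speech tag contains no spaces, that a single space separates the tag from the definition, and that the tag may or may not carry the leading "%" shown in the list. The names Entry, parse_line and top_level are made up for illustration.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    pos: str         # tag without the leading "%", e.g. "affix.prefix"
    definition: str  # raw wikitext, left unparsed


def parse_line(line: str) -> Optional[Entry]:
    """Parse one 'WORD  PART_OF_SPEECH DEFINITION' line, or return None."""
    line = line.rstrip("\n")
    # Words can contain single spaces (phrases, "a- -ing"), so split
    # on the first run of two spaces only.
    word, sep, rest = line.partition("  ")
    if not sep or not word or not rest:
        return None
    # Assumption: the tag has no spaces and one space precedes the definition.
    pos, _, definition = rest.partition(" ")
    # Assumption: tolerate tags written with or without the "%" prefix.
    return Entry(word, pos.lstrip("%"), definition)


def top_level(pos: str) -> str:
    """Return the broad category of a dotted tag, e.g. 'affix.prefix' -> 'affix'."""
    return pos.split(".")[0]
```

For example, parse_line applied to the made-up line "beyond belief  %phrase.prepositional [[incredible]]" would give the word "beyond belief", the tag "phrase.prepositional" and the definition "[[incredible]]". Lines without a double space are returned as None rather than guessed at.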