
[ Python NLP: TypeError: not all arguments converted during string formatting ]

I tried the code from "Natural Language Processing with Python", but a TypeError occurred.

import nltk
from nltk.corpus import brown

suffix_fdist = nltk.FreqDist()
for word in brown.words():
    word = word.lower()
    suffix_fdist.inc(word[-1:])
    suffix_fdist.inc(word[-2:])
    suffix_fdist.inc(word[-3:])
common_suffixes = suffix_fdist.items()[:100]

def pos_features(word):
    features = {}
    for suffix in common_suffixes:
        features['endswith(%s)' % suffix] = word.lower().endswith(suffix)
    return features
pos_features('people')

The error is below:

Traceback (most recent call last):
  File "/home/wanglan/javadevelop/TestPython/src/FirstModule.py", line 323, in <module>
    pos_features('people')
  File "/home/wanglan/javadevelop/TestPython/src/FirstModule.py", line 321, in pos_features
    features['endswith(%s)' % suffix] = word.lower().endswith(suffix)
TypeError: not all arguments converted during string formatting

Could anyone help me find out where I went wrong?

Answer 1


suffix is a tuple, because .items() returns (key, value) tuples. When the right-hand side of % is a tuple, its values are unpacked and substituted for each % format specifier in order. The error is complaining that the tuple has more entries than the format string has % specifiers: here, two values (the suffix and its count) for a single %s.
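
The behaviour can be reproduced without NLTK. The snippet below is only a small sketch: the ('e', 1234) pair is a made-up stand-in for one (suffix, count) entry, not a value taken from the Brown corpus.

```python
# Hypothetical (suffix, count) pair, standing in for one entry of .items().
suffix = ('e', 1234)

# One %s but a two-element tuple on the right-hand side: this raises.
try:
    label = 'endswith(%s)' % suffix
except TypeError as exc:
    error_message = str(exc)

# Passing only the key works as intended.
label_key = 'endswith(%s)' % suffix[0]

# Wrapping the tuple in a one-element tuple formats the whole pair instead.
label_whole = 'endswith(%s)' % (suffix,)

print(error_message)
print(label_key)
print(label_whole)
```

The last form is mostly useful for debugging, when you want to see the whole tuple in the output.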

You probably want just the key (the actual suffix), in which case you should use suffix[0], or call .keys() instead of .items() to retrieve only the dictionary keys.
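
A possible corrected version is sketched below. To keep it runnable without downloading a corpus, it makes two assumptions that are not in the question: collections.Counter replaces nltk.FreqDist, and a short hypothetical word list replaces brown.words(). The only change that matters for the error is that common_suffixes holds plain suffix strings rather than (suffix, count) tuples.

```python
from collections import Counter

# Hypothetical stand-in for brown.words(); the real corpus is far larger.
words = ['People', 'table', 'apple', 'running', 'singing']

# Count the last one, two and three characters of every word.
suffix_counts = Counter()
for word in words:
    word = word.lower()
    suffix_counts[word[-1:]] += 1
    suffix_counts[word[-2:]] += 1
    suffix_counts[word[-3:]] += 1

# Keep only the suffix strings (the keys), dropping the counts.
common_suffixes = [suffix for suffix, count in suffix_counts.most_common(100)]

def pos_features(word):
    features = {}
    for suffix in common_suffixes:
        # suffix is now a str, so a single %s is enough.
        features['endswith(%s)' % suffix] = word.lower().endswith(suffix)
    return features

features = pos_features('people')
print(features)
```

With recent NLTK releases, FreqDist is itself a Counter subclass, so the same counting and most_common() calls should carry over if you swap the stand-ins back for nltk.FreqDist() and brown.words().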