
[ Pythonic Way to compare two unordered lists by attributes ]

What is the most Pythonic way to compare two unordered lists by one or more of their attributes? Specifically, I would like to check whether, for each item in list A, there exists an item in list B that matches it in a specified attribute.

In my example case, I have two .zip files in a unit test and want to test whether their contents match, but I am really looking for a good general solution for my personal toolset. This was my first attempt:

from os.path import basename
from zipfile import ZipFile

with ZipFile('A.zip') as old, ZipFile('B.zip') as new:
    oldFileInfo = old.infolist()

    # Every entry in B.zip needs a counterpart in A.zip with the same
    # CRC and the same base filename.
    allFound = True
    for info in new.infolist():
        matches = [item for item in oldFileInfo
                   if item.CRC == info.CRC and
                   basename(item.filename) == basename(info.filename)]
        if len(matches) == 0:
            allFound = False
            break

Maybe it is trivial, but I have not yet found a nice way to do it.

Greetings Michael

Answer 1


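This is easy with sets:

if set(list1).difference(set(list2)):
    # some items of list1 have no match in list2
    # unmatched_items = set(list1).difference(set(list2))
    pass
else:
    # every item of list1 also appears in list2
    # (use set(list1) == set(list2) to test for equality in both directions)
    pass

To apply this to the zip entries, first reduce each entry to a hashable tuple of the attributes you care about: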

list1 = [(i.CRC, basename(i.filename)) for i in old.infolist()]
list2 = [(i.CRC, basename(i.filename)) for i in new.infolist()]

Answer 2


One possible way to do it, where attribute1 and attribute2 stand for whichever attributes you want to compare:

def areEqual(old, new):
    set1 = set((x.attribute1, x.attribute2) for x in old)
    set2 = set((x.attribute1, x.attribute2) for x in new)

    return set1 == set2
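
The same idea can be generalized so that the attribute names are passed in rather than hard-coded. The following is only a sketch: the helper names all_matched and same_by_attrs are hypothetical, the SimpleNamespace objects are stand-ins for real items, and the compared attribute values are assumed to be hashable.

```python
from operator import attrgetter
from types import SimpleNamespace


def all_matched(items, candidates, *attrs):
    """True if every item has a counterpart in candidates with equal attrs."""
    key = attrgetter(*attrs)  # one name -> value, several names -> tuple
    available = {key(c) for c in candidates}
    return all(key(item) in available for item in items)


def same_by_attrs(a, b, *attrs):
    """True if both lists yield the same set of attribute values."""
    key = attrgetter(*attrs)
    return {key(x) for x in a} == {key(x) for x in b}


# Stand-in objects with made-up attributes, for illustration only.
list_a = [SimpleNamespace(crc=1, name="x"), SimpleNamespace(crc=2, name="y")]
list_b = [
    SimpleNamespace(crc=2, name="y"),
    SimpleNamespace(crc=1, name="x"),
    SimpleNamespace(crc=3, name="z"),
]

a_in_b = all_matched(list_a, list_b, "crc", "name")
b_in_a = all_matched(list_b, list_a, "crc", "name")
equal = same_by_attrs(list_a, list_b, "crc")
```

Note that all_matched is a one-directional check (A is covered by B), which is what the question asks for, while same_by_attrs requires both directions to hold.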

Answer 3


You can create sets out of the old and new lists and then compare them (wrap item.filename in basename() if directory names should be ignored, as in the question):

old_set = set((item.CRC, item.filename) for item in old_info)
new_set = set((item.CRC, item.filename) for item in new_info)

all_match = new_set.issubset(old_set)  # or old_set.issuperset(new_set)
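
For a unit test, the set comparison can be tried out without any files on disk. The sketch below builds two small archives in memory; make_zip and zip_keys are hypothetical helper names, the archive contents are made up, and basename is applied to the filename as in the question.

```python
import io
from os.path import basename
from zipfile import ZipFile


def make_zip(files):
    """Build an in-memory zip from a {name: bytes} mapping (test helper)."""
    buf = io.BytesIO()
    with ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


def zip_keys(zf):
    """Reduce each archive entry to a hashable (CRC, base filename) pair."""
    return {(info.CRC, basename(info.filename)) for info in zf.infolist()}


# Made-up contents: the new archive holds one of the old files under a
# different directory name.
old_buf = make_zip({"docs/a.txt": b"alpha", "docs/b.txt": b"beta"})
new_buf = make_zip({"other/a.txt": b"alpha"})

with ZipFile(old_buf) as old, ZipFile(new_buf) as new:
    all_match = zip_keys(new) <= zip_keys(old)       # same as issubset
    same_contents = zip_keys(new) == zip_keys(old)   # both directions
```

Here all_match holds because the single entry in the new archive has a counterpart in the old one, while same_contents does not, since the old archive contains an extra file.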

Answer 4


You can start by sorting both lists by the attributes you want to compare. Sorting costs O(n log n); afterwards you can compare the elements pairwise and stop as soon as you find a pair that does not match.
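
A minimal sketch of the sorting approach follows. The function name same_items_sorted is hypothetical, the SimpleNamespace objects are stand-ins, and the compared attribute values are assumed to be mutually orderable. Unlike the set-based answers, this variant tests for equality of the two lists and counts duplicates.

```python
from operator import attrgetter
from types import SimpleNamespace


def same_items_sorted(a, b, *attrs):
    """Compare two lists by the given attributes after sorting both."""
    if len(a) != len(b):
        return False
    key = attrgetter(*attrs)
    keys_a = sorted(map(key, a))  # O(n log n)
    keys_b = sorted(map(key, b))
    # all() stops at the first pair that does not match.
    return all(x == y for x, y in zip(keys_a, keys_b))


# Stand-in objects with made-up attributes, for illustration only.
first = [SimpleNamespace(crc=2, name="y"), SimpleNamespace(crc=1, name="x")]
second = [SimpleNamespace(crc=1, name="x"), SimpleNamespace(crc=2, name="y")]
third = [SimpleNamespace(crc=1, name="x"), SimpleNamespace(crc=1, name="x")]

reordered_equal = same_items_sorted(first, second, "crc", "name")
duplicates_equal = same_items_sorted(first, third, "crc", "name")
```

With hashable attributes the set-based comparison is usually simpler; sorting is mainly useful when the values can be ordered but not hashed, or when duplicates matter.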