[ Pythonic Way to compare two unordered lists by attributes ]
What is the most Pythonic way to compare two unordered lists by one or more of their attributes? I would like to know whether there is a Pythonic way to check that, for each item in list A, there is an item in list B that matches it in a specified attribute.
In my example case, I have two .zip files in a unit test and want to test whether the files match, but I am really looking for a good general solution for my personal toolset. This was my first attempt:
from os.path import basename
from zipfile import ZipFile

with ZipFile('A.zip') as old:
    with ZipFile('B.zip') as new:
        oldFileInfo = old.infolist()
        allFound = True
        for info in new.infolist():
            # Entries of A.zip with the same CRC and the same base filename
            matches = [item for item in oldFileInfo
                       if item.CRC == info.CRC and
                       basename(item.filename) == basename(info.filename)]
            if len(matches) == 0:
                allFound = False
                break
Maybe it is trivial, but I have not yet found a nice way to do it.
Greetings Michael
Answer 1
It is easy, you should use sets:
if set(list1).difference(set(list2)):
    # some items of list1 have no match in list2
    # different_items = set(list1).difference(set(list2))
    pass
else:
    # every item of list1 also appears in list2
    pass
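To get comparable items, reduce each entry to a tuple of the attributes you care about. Note that difference() is one-directional: swap the operands to test the other direction, or compare the two sets with == to require both.
list1 = [(i.CRC, basename(i.filename)) for i in old.infolist()]
list2 = [(i.CRC, basename(i.filename)) for i in new.infolist()]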
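For a unit test, the same idea can be tried without any files on disk. The following is only a sketch: make_zip and crc_name_pairs are hypothetical helper names, and the archive contents are made up for illustration. It checks the direction asked for in the question, i.e. that every entry of the new archive has a match in the old one.

```python
import io
from os.path import basename
from zipfile import ZipFile


def make_zip(files):
    """Build an in-memory zip archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    buffer.seek(0)
    return buffer


def crc_name_pairs(archive):
    """Return the set of (CRC, base filename) pairs of all archive members."""
    return {(info.CRC, basename(info.filename)) for info in archive.infolist()}


# Illustrative data: the same file lives in a subfolder in the old archive.
old_zip = make_zip({"docs/a.txt": b"alpha", "b.txt": b"beta"})
new_zip = make_zip({"a.txt": b"alpha"})

with ZipFile(old_zip) as old, ZipFile(new_zip) as new:
    # Entries of the new archive that have no counterpart in the old one
    missing = crc_name_pairs(new) - crc_name_pairs(old)

all_found = not missing
```

Keeping the difference in missing rather than only a boolean makes a failing test easier to read, because the unmatched entries can go straight into the assertion message.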
Answer 2
One possible way to do it can be:
def areEqual(old, new):
    set1 = set((x.attribute1, x.attribute2) for x in old)
    set2 = set((x.attribute1, x.attribute2) for x in new)
    return set1 == set2
Answer 3
You can create sets out of old and new lists and then compare them:
old_set = set((item.CRC, item.filename) for item in old_info)
new_set = set((item.CRC, item.filename) for item in new_info)
all_match = new_set.issubset(old_set) # or old_set.issuperset(new_set)
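Since the question asks for a general tool, the subset test can be wrapped in a helper that takes the attribute names as arguments. This is a sketch under assumptions: all_matched and the Entry type are hypothetical names, and the attribute values must be hashable.

```python
from collections import namedtuple
from operator import attrgetter


def all_matched(items, candidates, *attrs):
    """Return True if every item has a candidate with equal values for attrs."""
    key = attrgetter(*attrs)  # one attribute -> value, several -> tuple
    candidate_keys = {key(candidate) for candidate in candidates}
    return all(key(item) in candidate_keys for item in items)


# Hypothetical record type, used only to demonstrate the helper.
Entry = namedtuple("Entry", "name crc size")

old_entries = [Entry("a.txt", 1, 10), Entry("b.txt", 2, 20)]
new_entries = [Entry("b.txt", 2, 99)]

print(all_matched(new_entries, old_entries, "name", "crc"))   # True
print(all_matched(new_entries, old_entries, "name", "size"))  # False
```

For derived values such as basename(item.filename), a variant that accepts a key function instead of attribute names would be needed.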
Answer 4
You can start by sorting the lists. Sorting costs only O(n log n); after that you can compare the elements pairwise and stop at the first pair that does not match.
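A minimal sketch of the sorting approach, assuming the chosen attribute values can be ordered; same_by_sorted is a hypothetical name. Unlike the set-based answers, this variant counts duplicates and requires both lists to match in both directions.

```python
from operator import attrgetter
from types import SimpleNamespace


def same_by_sorted(first, second, *attrs):
    """Compare two lists by the given attributes after sorting both."""
    if len(first) != len(second):
        return False
    key = attrgetter(*attrs)
    pairs = zip(sorted(map(key, first)), sorted(map(key, second)))
    # all() stops at the first pair that does not match
    return all(a == b for a, b in pairs)


# Illustrative objects with made-up attribute values.
x = [SimpleNamespace(name="a", crc=1), SimpleNamespace(name="b", crc=2)]
y = [SimpleNamespace(name="b", crc=2), SimpleNamespace(name="a", crc=1)]

print(same_by_sorted(x, y, "name", "crc"))  # True
```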