JP Vossen via plug on 10 Jul 2021 09:39:43 -0700
Re: [PLUG] Python nested dict data structure
Thanks Victor, that's much more what I thought I should be able to do! I had to add the `company` key back in to the second part, but then it works at that level. It took me several tries to get it to work at a level below that, but that's probably just me. I also find this a bit hard to read, but that's also probably just me and might be better with more meaningful real-world variable names.

The only flaw is that the order in which fields are encountered in the input files matters, which I'll explain in a moment.

Modified code (works for Python2 or 3):

```
 1  #!/usr/bin/env python2
 2  # dict2.py--do I REALLY have to do all this crap just for NESTED dicts!?
 3  # 2021-07-10
 4  # From Victor in EM "Re: [PLUG] Python nested dict data structure"
 5
 6  import json
 7  from collections import defaultdict
 8
 9  d = dict()
10
11  # Main
12  company = 'Acme Inc'    # Key in both (all) files
13
14  # If "company" key exists, add or set the "region" subkey;
15  # else the "company" key should default to an empty dictionary then
16  # add or set the "region" subkey.
17  #
18  # First read file 1, containing: Company\tRegion\tOther-stuff-I-don't-care-about-here
19  d.setdefault(company, dict())['region'] = 'US'
20
21  # If "counter" key exists, get its current value and add X;
22  # If "counter" does not exist, return 0 as the default and add X.
23  #
24  # ...LATER...read file 2, containing *multiple records* of: Company\tthis\tthat\tCounter
25  d[company]['counter'] = d[company].get('counter', 0) + 2
26  d[company]['counter'] = d[company].get('counter', 0) + 3
27
28  # ...STILL LATER...read file 3, containing even more crazy stuff
29  d[company].setdefault('subkey', dict())['subsubkey'] = 'foo'
30  d[company]['subkey']['subsubint'] = d[company]['subkey'].get('subsubint', 0) + 6
31
32  #print(d)
33  print(json.dumps(d, indent=2, sort_keys=True))  # Pretty but needs: import json
```

Expected and desired output:

```
{
  "Acme Inc": {
    "counter": 5,
    "region": "US",
    "subkey": {
      "subsubint": 6,
      "subsubkey": "foo"
    }
  }
}
```

That's all great. But if I move line 30 above line 29, it fails:

```
$ ./dict2.py
Traceback (most recent call last):
  File "/home/jp/MyDocs/HOME/CODE/Python/dict2.py", line 29, in <module>
    d[company]['subkey']['subsubint'] = d[company]['subkey'].get('subsubint', 0) + 6
KeyError: 'subkey'
```

The same would happen at the `company` level too. I know why...it's the original problem, the key doesn't exist yet. My problem is that I am cherry-picking fields out of 2+ files and my input might be in any order.

This longer code copes with the reordering, as you'll see because I moved the lines around without changing the comments. (The file 3 calls pass `mydict[company]` directly, so they still need the `company` key to exist first.) Works for Python2 or 3:

```
#!/usr/bin/env python2
# dict.py--do I REALLY have to do all this crap just for NESTED dicts!?
# JP, 2021-07-02
# https://www.geeksforgeeks.org/python-nested-dictionary/
# but... https://stackoverflow.com/questions/1024847/how-can-i-add-new-keys-to-a-dictionary
# And >EM "[PLUG] Python nested dict data structure" 2021-07-05

import json

mydict = {}  # Not a good idea to use "dict"

# Add Nested Key/Value pair to mydict
def add_nkv(mydict, key, subkey, value):
    if key not in mydict:            # Key must exist or error
        mydict[key] = {}             # Create empty sub-mydict
    mydict[key][subkey] = value      # Add value

# Accumulate a value in a nested mydict
def acc_nkv(mydict, key, subkey, value):
    if key not in mydict:            # Parent key must exist or error
        mydict[key] = {}             # Create empty sub-mydict
    if subkey not in mydict[key]:    # Subkey must exist or error
        mydict[key][subkey] = value  # Add new value
    else:
        mydict[key][subkey] += value # Accumulate value

# Main
company = 'Acme Inc'  # Key in both (all) files

# ...LATER...read file 2, containing *multiple records* of: Company\tthis\tthat\tCounter
acc_nkv(mydict, company, 'counter', 2)  # Does: mydict[company]['counter'] += val
acc_nkv(mydict, company, 'counter', 3)

# First, read file 1, containing: Company\tRegion\tOther-stuff-I-don't-care-about-here
add_nkv(mydict, company, 'region', 'US')  # Does: mydict[company]['region'] = val

# ...STILL LATER...read file 3, containing even more crazy stuff
acc_nkv(mydict[company], 'subkey', 'subsubint', 6)      # Does: mydict[company]['subkey']['subsubint'] += val
add_nkv(mydict[company], 'subkey', 'subsubkey', 'foo')  # Does: mydict[company]['subkey']['subsubkey'] = 'foo'

#print(mydict)  # Simple
print(json.dumps(mydict, indent=2, sort_keys=True))  # Pretty but needs: import json
```

Output:

```
$ ./dict.py
{
  "Acme Inc": {
    "counter": 5,
    "region": "US",
    "subkey": {
      "subsubint": 6,
      "subsubkey": "foo"
    }
  }
}
```

On 7/6/21 10:40 AM, Victor via plug wrote:
> # Add/inc_nstkv a value in a nested d
> def inc_nstkv(d, key, subkey, value):
>     d.setdefault(key, dict())
>     d[subkey] = d.get(subkey, 1) + 1
> *************************************
>     d[subkey] = d.get(subkey, 0) + value
> *************************************
> ### But that doesn't use `value`. I guess I should have called
> ### it accumulate and not increment. This doesn't work:
>     d[subkey] = d.get(subkey, 1) += value
> ### And I think it overwrites the region value, depending on order
> ### (which will be unpredictable).

Oops, you're right that .get() should be using 0 as the default and +value; fixed inline above.

What led me to think of .setdefault() and .get() was that you wrote their exact logic using other code plus the explanation that you're pulling data from multiple sources where you don't have a uniform dictionary output in mind. You can even eliminate the functions you created entirely using .setdefault() and .get(), but it's up to you if that diminishes readability. Example below.

> And I think it overwrites the region value, depending on order

I don't believe that's a problem, but maybe I'm not understanding your expected output.

```
d = dict()
company = 'Acme Inc'

# If "company" key exists, add or set the "region" subkey; else the "company"
# key should default to an empty dictionary then add or set the "region" subkey.
d.setdefault(company, dict())['region'] = 'US'

# If "counter" key exists, get its current value and add X; if "counter"
# does not exist, return 0 as the default and add X.
d['counter'] = d.get('counter', 0) + 1
d['counter'] = d.get('counter', 0) + 2

print(d)
```

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
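For what it's worth, one possible way to keep the short one-line style and still not care about input order is to let every missing level create itself. The following is only a minimal sketch, assuming a recursive `defaultdict` is acceptable for this data; the `tree` helper name is made up for illustration and is not from this thread. The updates are deliberately applied deepest-first, the order that raised `KeyError: 'subkey'` in dict2.py.

```python
import json
from collections import defaultdict


def tree():
    """Hypothetical helper: a dict whose missing keys become nested trees."""
    return defaultdict(tree)


d = tree()
company = 'Acme Inc'  # Key in both (all) files

# Deliberately "wrong" order: file 3 fields first, then file 2, then file 1.
# Indexing a missing key with [] creates the intermediate level on the fly;
# .get() only reads, so it returns the default without creating anything.
d[company]['subkey']['subsubint'] = d[company]['subkey'].get('subsubint', 0) + 6
d[company]['subkey']['subsubkey'] = 'foo'
d[company]['counter'] = d[company].get('counter', 0) + 2
d[company]['counter'] = d[company].get('counter', 0) + 3
d[company]['region'] = 'US'

# defaultdict is a dict subclass, so json.dumps handles it like a plain dict.
result = json.dumps(d, indent=2, sort_keys=True)
print(result)
```

The trade-off is that merely reading `d[some_key]` with square brackets silently creates an empty entry, so lookups that should not modify the structure are safer with `.get()` or an `in` test.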
Thanks,
JP
--
-------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/