| JP Vossen via plug on 10 Jul 2021 09:39:43 -0700 |
| Re: [PLUG] Python nested dict data structure |
Thanks Victor, that's much more what I thought I should be able to do!
I had to add the `company` key back in to the second part, but then it works at that level. It took me several tries to get it to work at a level below that, but that's probably just me. I also find this a bit hard to read, but that's also probably just me and might be better with more meaningful real-world variable names.
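As a minimal sketch of why the `company` key has to go back in (the names `flat` and `nested` are made up for illustration): without it, the counter lands next to the company record instead of inside it.

```python
# Hypothetical names: "flat" and "nested" exist only for this comparison.
company = 'Acme Inc'

# Counter line without the company key, as in the original suggestion.
flat = dict()
flat.setdefault(company, dict())['region'] = 'US'
flat['counter'] = flat.get('counter', 0) + 2    # top level, beside 'Acme Inc'

# Counter line with the company key added back in.
nested = dict()
nested.setdefault(company, dict())['region'] = 'US'
nested[company]['counter'] = nested[company].get('counter', 0) + 2  # under 'Acme Inc'

print(flat)
print(nested)
```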
The only flaw is that the order in which fields are encountered in the input files matters, which I'll explain in a moment.
Modified code (works for Python2 or 3):
```
1 #!/usr/bin/env python2
2 # dict2.py--do I REALLY have to do all this crap just for NESTED dicts!?
3 # 2021-07-10
4 # From Victor in EM "Re: [PLUG] Python nested dict data structure"
5
6 import json
7 from collections import defaultdict
8
9 d = dict()
10
11 # Main
12 company = 'Acme Inc' # Key in both (all) files
13
14 # If "company" key exists, add or set the "region" subkey;
15 # else the "company" key should default to an empty dictionary then
16 # add or set the "region" subkey.
17 #
18 # First read file 1, containing: Company\tRegion\tOther-stuff-I-don't-care-about-here
19 d.setdefault(company, dict())['region'] = 'US'
20
21 # If "counter" key exists, get its current value and add X;
22 # If "counter" does not exist, return 0 as the default and add X.
23 #
24 # ...LATER...read file 2, containing *multiple records* of: Company\tthis\tthat\tCounter
25 d[company]['counter'] = d[company].get('counter', 0) + 2
26 d[company]['counter'] = d[company].get('counter', 0) + 3
27
28 # ...STILL LATER...read file 3, containing even more crazy stuff
29 d[company].setdefault('subkey', dict())['subsubkey'] = 'foo'
30 d[company]['subkey']['subsubint'] = d[company]['subkey'].get('subsubint', 0) + 6
31
32 #print(d)
33 print(json.dumps(d, indent=2, sort_keys=True)) # Pretty but needs: import json
```
Expected and desired output:
```
{
  "Acme Inc": {
    "counter": 5,
    "region": "US",
    "subkey": {
      "subsubint": 6,
      "subsubkey": "foo"
    }
  }
}
```
That's all great. But if I move line 30 above line 29, it fails:
```
$ ./dict2.py
Traceback (most recent call last):
  File "/home/jp/MyDocs/HOME/CODE/Python/dict2.py", line 29, in <module>
    d[company]['subkey']['subsubint'] = d[company]['subkey'].get('subsubint', 0) + 6
KeyError: 'subkey'
```
The same would happen at the `company` level too. I know why...it's the original problem, the key doesn't exist yet. My problem is that I am cherry-picking fields out of 2+ files and my input might be in any order.
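One order-independent variant of the one-liners, offered only as a sketch: walk down with `setdefault` at every level on every line, so whichever record arrives first creates the missing parents. The temporaries `rec` and `sub` are names made up here.

```python
import json

d = dict()
company = 'Acme Inc'

# Deliberately out of file order: the deepest accumulate comes first.
sub = d.setdefault(company, dict()).setdefault('subkey', dict())
sub['subsubint'] = sub.get('subsubint', 0) + 6

# Same walk, then a plain assignment at the leaf.
d.setdefault(company, dict()).setdefault('subkey', dict())['subsubkey'] = 'foo'

# Counter records; setdefault returns the existing company dict here.
rec = d.setdefault(company, dict())
rec['counter'] = rec.get('counter', 0) + 2
rec['counter'] = rec.get('counter', 0) + 3

# The "first" file last.
d.setdefault(company, dict())['region'] = 'US'

print(json.dumps(d, indent=2, sort_keys=True))
```

The cost is readability: every line repeats the full walk from the top, which is the trade-off the helper functions avoid.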
This longer code works in any order, as you'll see because I moved the lines around without changing the comments (works for Python2 or 3):
```
#!/usr/bin/env python2
# dict.py--do I REALLY have to do all this crap just for NESTED dicts!?
# JP, 2021-07-02
# https://www.geeksforgeeks.org/python-nested-dictionary/
# but... https://stackoverflow.com/questions/1024847/how-can-i-add-new-keys-to-a-dictionary
# And >EM "[PLUG] Python nested dict data structure" 2021-07-05

import json

mydict = {}  # Not a good idea to use "dict"

# Add Nested Key/Value pair to mydict
def add_nkv(mydict, key, subkey, value):
    if key not in mydict:                # Key must exist or error
        mydict[key] = {}                 # Create empty sub-mydict
    mydict[key][subkey] = value          # Add value

# Accumulate a value in a nested mydict
def acc_nkv(mydict, key, subkey, value):
    if key not in mydict:                # Parent key must exist or error
        mydict[key] = {}                 # Create empty sub-mydict
    if subkey not in mydict[key]:        # Subkey must exist or error
        mydict[key][subkey] = value      # Add new value
    else:
        mydict[key][subkey] += value     # Accumulate value

# Main
company = 'Acme Inc'  # Key in both (all) files

# ...LATER...read file 2, containing *multiple records* of: Company\tthis\tthat\tCounter
acc_nkv(mydict, company, 'counter', 2)  # Does: mydict[company]['counter'] += val
acc_nkv(mydict, company, 'counter', 3)

# First, read file 1, containing: Company\tRegion\tOther-stuff-I-don't-care-about-here
add_nkv(mydict, company, 'region', 'US')  # Does: mydict[company]['region'] = val

# ...STILL LATER...read file 3, containing even more crazy stuff
acc_nkv(mydict[company], 'subkey', 'subsubint', 6)      # Does: mydict[company]['subkey']['subsubint'] += val
add_nkv(mydict[company], 'subkey', 'subsubkey', 'foo')  # Does: mydict[company]['subkey']['subsubkey'] = 'foo'

#print(mydict)  # Simple
print(json.dumps(mydict, indent=2, sort_keys=True))  # Pretty but needs: import json
```
Output:
```
$ ./dict.py
{
  "Acme Inc": {
    "counter": 5,
    "region": "US",
    "subkey": {
      "subsubint": 6,
      "subsubkey": "foo"
    }
  }
}
```
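If the two-level helpers feel limiting, a possible generalization (a sketch, not something from this thread) takes the whole key path as a list. `set_path` and `acc_path` are hypothetical names, and `acc_path` assumes numeric values because a missing leaf counts as 0.

```python
import json

def set_path(d, keys, value):
    """Set d[k1][k2]...[kn] = value, creating empty dicts on the way down."""
    for key in keys[:-1]:
        d = d.setdefault(key, {})   # Reuse the child dict or create it
    d[keys[-1]] = value

def acc_path(d, keys, value):
    """Add value to d[k1]...[kn]; a missing leaf is treated as 0."""
    for key in keys[:-1]:
        d = d.setdefault(key, {})
    d[keys[-1]] = d.get(keys[-1], 0) + value

mydict = {}
company = 'Acme Inc'

# Records in arbitrary order, deepest level first.
acc_path(mydict, [company, 'subkey', 'subsubint'], 6)
set_path(mydict, [company, 'subkey', 'subsubkey'], 'foo')
acc_path(mydict, [company, 'counter'], 2)
acc_path(mydict, [company, 'counter'], 3)
set_path(mydict, [company, 'region'], 'US')

print(json.dumps(mydict, indent=2, sort_keys=True))
```

Because each call starts from the top-level dict, the file-3 calls no longer depend on `mydict[company]` already existing.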
On 7/6/21 10:40 AM, Victor via plug wrote:
> # Add/inc_nstkv a value in a nested d
> def inc_nstkv(d, key, subkey, value):
>     d.setdefault(key, dict())
>     d[subkey] = d.get(subkey, 1) + 1
> *************************************
>     d[subkey] = d.get(subkey, 0) + value
> *************************************
> ### But that doesn't use `value`. I guess I should have called
> ### it accumulate and not increment. This doesn't work:
>     d[subkey] = d.get(subkey, 1) += value
> ### And I think it overwrites the region value, depending on order
> ### (which will be unpredictable).

Oops, you're right that .get() should be using 0 as the default and +value; fixed inline above. What led me to think of .setdefault() and .get() was that you wrote their exact logic using other code plus the explanation that you're pulling data from multiple sources where you don't have a uniform dictionary output in mind. You can even eliminate the functions you created entirely using .setdefault() and .get(), but it's up to you if that diminishes readability. Example below.

> And I think it overwrites the region value, depending on order

I don't believe that's a problem, but maybe I'm not understanding your expected output.

```
d = dict()
company = 'Acme Inc'
# If "company" key exists, add or set the "region" subkey; else the "company" key should default to an empty dictionary then add or set the "region" subkey.
d.setdefault(company, dict())['region'] = 'US'
# If "counter" key exists, get its current value and add X; if "counter" does not exist, return 0 as the default and add X.
d['counter'] = d.get('counter', 0) + 1
d['counter'] = d.get('counter', 0) + 2
print(d)
```

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
Thanks,
JP
--
-------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/