JP Vossen via plug on 15 Apr 2022 14:41:51 -0700
[PLUG] Book: Data Science at the Command Line
I mentioned this one in the "A list of new(ish) command line tools" thread, but I've finished reading it now and wanted to call some things out. For us on this list, if nothing else, go read https://datascienceatthecommandline.com/2e/list-of-command-line-tools.html.

I really liked this book. It reads well, especially for such a potentially dry topic, it's engaging, and it just flows.

I learned about a TON of commands I didn't know existed, but that's because most of them aren't installed by default, or in repos at all. That's a bummer, but knowing the tool exists is sometimes 95% of the battle.

I've called out 3 really interesting CSV commands in the list below.

Full disclosure, the author recommended one of my books in a couple of places, but I didn't know that until I read it.

----
* https://www.oreilly.com/library/view/data-science-at/9781492087908/
    * Data Science at the Command Line, 2nd Edition
    * by Jeroen Janssens
    * Released August 2021
    * Publisher(s): O'Reilly Media, Inc.
    * ISBN: 9781492087915
* Code: https://github.com/jeroenjanssens/data-science-at-the-command-line
* Site: https://datascienceatthecommandline.com/
    * Read it for free: https://datascienceatthecommandline.com/2e/
* AWESOME list of CLI tools: https://datascienceatthecommandline.com/2e/list-of-command-line-tools.html
    * The only problem is that many of these tools are NOT installed by default, and some are not in repos.
    * But some are: apt install moreutils cvstools
    * And there's a (largeish Ubuntu zsh) Docker container with ALL of those tools!
        * Read: https://datascienceatthecommandline.com/2e/chapter-2-getting-started.html
        * docker pull datasciencetoolbox/dsatcl2e
        * docker run --rm -it -v "$PWD":/data datasciencetoolbox/dsatcl2e

Amazing tools:

https://csvkit.rtfd.org:
    sudo apt install csvkit    # LOTS of Python modules

csvcut Filter and truncate CSV files
csvgrep Search CSV files
csvjoin Execute a SQL-like join to merge CSV files on a specified column or columns
csvlook Render a CSV file in the console as a Markdown-compatible, fixed-width table
csvsort Sort CSV files
csvsql Execute SQL statements on CSV files <<<<<<<<<<<<<<<<<<<<<<<
csvstack Stack up the rows from multiple CSV files
csvstat Print descriptive statistics for each column in a CSV file
in2csv Convert common, but less awesome, tabular data formats to CSV
    in2csv data.xls > data.csv
sql2csv Execute an SQL query on a database and output the result to a CSV file

Related:

json2csv Convert JSON to CSV
    https://github.com/jehiah/json2csv
    Or: CSVKit: in2csv data.json > data.csv
xml2json Convert an XML input to a JSON output, using xml-mapping
    https://github.com/parmentf/xml2json
----

Later,
JP
--
-------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
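
For anyone who wants to try the csvsql entry from the list above, here is a minimal sketch, not taken from the book. The file name tools.csv and its columns are made up for illustration, and it assumes csvkit is installed (it only prints a hint if not). As far as I know, csvsql names the table after the input file minus the .csv extension, which is why the query says FROM tools.

```shell
#!/bin/sh
# Hypothetical sample data: the file name and columns are invented for this demo.
cat > tools.csv <<'EOF'
tool,package,in_repo
csvcut,csvkit,yes
csvsql,csvkit,yes
json2csv,json2csv,no
EOF

# Count data rows with plain shell tools (total lines minus the header line).
row_count=$(($(wc -l < tools.csv) - 1))
echo "data rows: $row_count"

if command -v csvsql >/dev/null 2>&1; then
    # Run a SQL query directly against the CSV file, then pretty-print
    # the resulting CSV as a fixed-width table with csvlook.
    csvsql --query "SELECT package, COUNT(*) AS n FROM tools GROUP BY package" tools.csv | csvlook
else
    # csvkit is not installed; point at the install command from the post.
    echo "csvsql not found; try: sudo apt install csvkit" >&2
fi
```

The pipe into csvlook is optional. Without it, csvsql emits plain CSV, which is handier when feeding another tool.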