The Issues
Setup Issues:
- “Un-tarred” isn’t obvious to everyone; Tatyana thought it meant “don’t extract it.”
- People naturally try to use Java 9 (it’s the first thing on that webpage).
- Installing Java 8 after Java 9 is really annoying on Mac (we should provide a tool for switching default Java versions).
- parsimonious is not installed by default.
- Spark 2.2.1 is the default Spark now, so people immediately try to download that.
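Several of the setup issues above could be caught by a small preflight script. Here is a rough sketch, assuming Java 8 is the version we want and that parsimonious is the only missing Python package; the function names parse_java_major and check_setup are made up for illustration and are not part of any existing tool.

```python
import importlib.util
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_151").
    if first == 1 and second is not None:
        return int(second)
    return first


def check_setup(expected_java=8):
    """Collect setup problems as strings; an empty list means none were found."""
    problems = []
    if shutil.which("java") is None:
        problems.append("java is not on the PATH")
    else:
        # `java -version` normally writes to stderr, so read both streams.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
        major = parse_java_major(result.stderr + result.stdout)
        if major is None:
            problems.append("could not determine the Java version")
        elif major != expected_java:
            problems.append(f"found Java {major}, expected Java {expected_java}")
    # Assumption: parsimonious is the only Python dependency worth checking here.
    if importlib.util.find_spec("parsimonious") is None:
        problems.append("parsimonious is not installed (try: pip install parsimonious)")
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("setup issue:", problem)
```

This only reports problems; it does not switch Java versions or install anything.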
UX/UI:
- TString() is probably going to seem weird (isn’t that different from the table printout?). I saw this when printing an expression.
- nNotMissing from the stats aggregator should use underscores.
- Should we have an option to print strings with quotes?
- We should probably just use None consistently everywhere, since that’s the Python name for missing (or maybe NA?).
- When do I use set versus hl.set? (Never use set?)
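To make the quoting and missing-value questions above concrete, here is a tiny sketch of what a display option could look like. format_value and its parameters are hypothetical and only illustrate the idea; this is not how the library prints values today.

```python
def format_value(value, quote_strings=False, missing="None"):
    """Render one value for display.

    quote_strings: show strings with quotes, so the string "None"
                   can be told apart from a missing value.
    missing:       the text used for a missing value ("None", "NA", ...).
    """
    if value is None:
        return missing
    if quote_strings and isinstance(value, str):
        return repr(value)
    return str(value)


if __name__ == "__main__":
    for value in ["None", None, 3]:
        print(format_value(value), "|", format_value(value, quote_strings=True))
```

Without quotes, the string "None" and a missing value print identically, which is one argument for offering the option.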
Presentation:
- Run describe before and after select, annotate, and transmute.
- Make sure each slide fits on a page.
- The discussion on joins is a bit abstract, especially the table[expr] bit. I think it’s natural for us PL folks, but maybe it would help to show an example.
- In Joins, it would be helpful for the users if we showed the three tables and their fields first, so they can see the connections for themselves.
- A visual example of explode is probably more helpful (maybe actually use a small table and show()):
    a b c [1,2,3]
    =>
    a b c 1
    a b c 2
    a b c 3
- I feel like we should use the VDS picture in the MatrixTable section.
- Focus on the phrase “compound key” rather than “two keys” (I think the phrase “two keys” can be confusing with the matrix table’s two axes).
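For the explode suggestion, a slide could show the before and after on a tiny table. The sketch below mimics the idea in plain Python on a list of dicts, so it runs without any setup. The explode function and the field names x, y, z, and values are made up for illustration and are not the library's implementation.

```python
def explode(rows, field):
    """Return one output row per element of row[field], copying the other fields."""
    exploded = []
    for row in rows:
        for element in row[field]:
            new_row = dict(row)       # copy so the input rows stay unchanged
            new_row[field] = element  # replace the list with a single element
            exploded.append(new_row)
    return exploded


if __name__ == "__main__":
    table = [{"x": "a", "y": "b", "z": "c", "values": [1, 2, 3]}]
    for row in explode(table, "values"):
        print(row)
    # {'x': 'a', 'y': 'b', 'z': 'c', 'values': 1}
    # {'x': 'a', 'y': 'b', 'z': 'c', 'values': 2}
    # {'x': 'a', 'y': 'b', 'z': 'c', 'values': 3}
```

In this sketch a row whose list is empty produces no output rows, which might be worth calling out on the slide too.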
Action Items for the Team
We should distribute these items among the team:
- Eliminate camel case everywhere.
- Automate as much of the install as possible for OS X and GNU/Linux systems.
- Harmonize the printed form of types (@tpoterba, are you already working on this? What is the preferred printed form?).
- Use a consistent display and name for the missing value.
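For the camel case item, a quick helper could generate candidate renames for review. This is only a sketch: camel_to_snake is a hypothetical name, and the rule is deliberately simple (it inserts an underscore before an uppercase letter that follows a lowercase letter or digit), so names with runs of capitals would still need a human look.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase name such as nNotMissing to snake_case."""
    # Put an underscore before each uppercase letter that follows
    # a lowercase letter or a digit, then lowercase everything.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


if __name__ == "__main__":
    print(camel_to_snake("nNotMissing"))  # n_not_missing
```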