Thursday, July 13, 2017

PhD Defense - Automated Data-Driven Hint Generation for Learning Programming

Kelly Rivers defended her PhD work this afternoon.  She will be returning to CMU this fall as a teaching professor.

Student enrollment is increasing, and since TAs / instructors are not scaling with it, more work is needed to automate student support.  Prior work (The Hint Factory) built models from prior student submissions; a current student's work can then be located within the model, which provides suggestions for how to proceed.  However, programming may not fit within this model, because the space of ways in which students can solve the problems is larger and more varied.

First, student code proceeds through a series of canonicalization steps: conversion to an AST, anonymization, and simplification.  For example, the following Python code:

import string
def any_lowercase(s):
  lst = [string.ascii_lowercase]
  for elem in s:
    if (elem in lst) == True:
      return True
    return False

Becomes

import string
def any_lowercase(p0):
  for v1 in p0:
    return (v1 in string.ascii_lowercase)
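
As a rough illustration of just the anonymization step, here is my own minimal sketch using Python's standard ast module.  This is not the implementation from the thesis; the class and function names are hypothetical, and it assumes Python 3.9+ for ast.unparse.  Parameters become p0, p1, ... and assigned variables become v0, v1, ..., sharing one counter as in the example above.

```python
import ast

class Anonymizer(ast.NodeTransformer):
    """Hypothetical helper: rename parameters to pN and assigned variables to vN."""

    def __init__(self):
        self.names = {}    # original name -> anonymized name
        self.counter = 0   # shared counter, so names come out as p0, v1, v2, ...

    def _fresh(self, prefix, old):
        # Reuse the existing mapping so every use of a name is renamed consistently.
        if old not in self.names:
            self.names[old] = f"{prefix}{self.counter}"
            self.counter += 1
        return self.names[old]

    def visit_arg(self, node):
        node.arg = self._fresh("p", node.arg)
        return node

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            node.id = self._fresh("v", node.id)
        elif node.id in self.names:
            node.id = self.names[node.id]
        # Names never bound in the student code (e.g. the string module) are left alone.
        return node

def anonymize(source):
    """Parse the source, rename its identifiers, and return the new source text."""
    tree = Anonymizer().visit(ast.parse(source))
    return ast.unparse(tree)

student = (
    "def any_lowercase(s):\n"
    "    for elem in s:\n"
    "        return elem in string.ascii_lowercase\n"
)
print(anonymize(student))
```

The function name is kept as-is in this sketch, and the simplification step (collapsing the if / return True / return False pattern) is not attempted here.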

Studies then went over 41 different problems with hundreds of correct solutions and thousands of incorrect solutions.  The model can then generate the edits and chain these hints as necessary.  In more than 99.9% of cases, the model could successfully generate a hint chain to reach a correct solution.

To further test this approach, the model was started with the empty space (just the teacher solution) and compared against the final model.  Ideally, the final model will propose fewer edits than the initial model, and this was true for 56% of problems.  Another 40% of problems were already optimal, and 3% are opportunities for improvement to the model.

Next, given that this model exists, how do the hints impact student learning?  In the first study, half of the students were selected and given optional access to the hint model.  Using a pre / post assessment, the measurement was a wash.  Instead, a second study was designed that required the students to use the system within a two hour OLI module.  Hints were provided with every submission, either before or after the midtest in the OLI module.  Only half of the students actually proceeded through the module in order.  However, most learning occurred within the pretest->practice->midtest sequence, so adding the students who completed that sequence increased the population.  The results show that the hints reduce the time required to learn the same amount.

From interviews with students, students need and want targeted help on their work.  However, the hints generated thus far were not always useful.  Another study was proposed based on different styles of hints: location, next-step, structure, and solution.  This study found that participants with lower expertise wanted more detailed hints.  Students sometimes used hints to find out what was wrong rather than how to solve it.  And often, students know what to do and just need a reference (via example / prior work) for how to do it, rather than a hint about what to do.

Monday, July 10, 2017

Wrote my own Calling Convention

I have been listening to the Dungeon Hacks audiobook, and it has reminded me both of my past joys of playing Angband and of some interesting little "hacks" I did in high school.

In high school, I wrote many programs on my TI-83, often to help me in other classes, which led to an interesting conversation:
Student: "Teacher, Brian has written programs on his calculator for this class."
Teacher: calls me forward "Brian, is this true?"
Me: "Yes."
Teacher: "Did you write them yourself?"
Me: "Yes."
Teacher: "I do not see what the problem is."

And besides writing useful programs, I also wrote games.  All in the TI-83's BASIC.  However, the TI-83 only had 24kB of space for programs and data.  And eventually I started exceeding this limit.  So I started finding interesting hacks that would reduce the space of programs, such as omitting the trailing " of a string, which is not required if that string ends the line.  Each would save 1 byte, but it started to add up.

Now, the TI-BASIC on the 83 was very primitive.  Particularly it had no GOSUB instruction, only GOTO.  So you cannot write functions / subroutines and reuse code, which would both improve the design and reduce the space required.  Now I already knew the basics (no pun intended) of C, so I knew that code should be able to call other functions and get values back.  But TI-BASIC would let you call another program and then return to the calling program after the called one finished.  That's like a function call, right?

Therefore, I wrote a library program.  Variables were global, so the calling program could set specific parameters in the global variables and then call the library program.  The library would use one parameter to determine which functionality to execute, update the necessary globals and then exit, thus returning.  And consequently, I had a library of common functions which sped up my development and reduced the space I needed.
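
To make the scheme concrete, here is a small sketch of the same idea in Python rather than TI-BASIC.  It is not my original calculator code: the variable names (F, A, B, R) and the two routines are made up for illustration.  The caller writes its arguments into globals, one global selects the routine, and the result comes back through another global.

```python
# Globals stand in for the calculator's shared variables (names are hypothetical).
F = 0  # selects which library routine to run
A = 0  # first argument
B = 0  # second argument
R = 0  # return value

def library():
    """Play the role of the library program: read globals, dispatch on F, write R."""
    global R
    if F == 1:
        R = max(A, B)          # routine 1: larger of the two arguments
    elif F == 2:
        R = A * A + B * B      # routine 2: sum of squares
    else:
        R = 0                  # unknown routine: return a default

def call(routine, a, b):
    """Play the role of the calling program: set globals, call, read the result."""
    global F, A, B
    F, A, B = routine, a, b
    library()
    return R

print(call(1, 3, 4))  # larger of 3 and 4
print(call(2, 3, 4))  # 3*3 + 4*4
```

The agreement about which global means what is the whole convention; nothing enforces it, so caller and library simply have to follow the same rules.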

It was only while listening to the audiobook last week that I realized that long ago, in high school, I had developed a simple calling convention.  And now I teach, among other things, the x86-64 Linux ABI (i.e. calling convention) to college students.

A calling convention just dictates how each register is used when calling a function: which ones need to be preserved, which hold the arguments, which holds the return value, etc.  And it also dictates the management of the stack.