Thoughts and experiences on various topics: puzzles, games, AI, collaboration, music, politics, and whatever else is on my mind

Archive for July, 2013

Notes on a Life in Progress – part 5 (career, marriage, parenthood 1985-1996)

GTE Laboratories (1985-1996)

In July of 1985, I started work at GTE Laboratories in Waltham, MA.  I was a Senior Member of Technical Staff in the Self-Improving Systems (Machine Learning) department of the Fundamental Research Lab.  Our mission was to do long-term basic research in machine learning.  This was exactly the environment I wanted to be in – I had stimulating colleagues who shared my interest in machine learning (many of them PhDs), and the opportunity to “think deeply about simple things”, i.e. perform basic research.

This was the ideal environment in which to continue my research on “discovery of macro-operators in problem-solving”.  I had begun thinking about this topic back at CMU, when reflecting on how I learned to solve the Rubik’s Cube.  Rich Korf did his Ph.D. thesis on an algorithm for filling in a table of macro-operators to solve certain types of puzzles, such as Rubik’s Cube and the 15-puzzle.  I was thinking about a more general and more heuristic approach where macros would be proposed and then evaluated on the basis of their contribution to improved problem-solving performance.  I continued thinking about this, and began writing some programs to encode my ideas, while teaching at Hampshire.  I wrote and published a paper in IJCAI-85 describing my initial ideas and results.  Because this paper was submitted near the end of my time at Hampshire, and because I anticipated starting work at MITRE, I listed MITRE as my affiliation in the title of the paper.  This turned out to be ironic, when I ended up working at GTE Labs instead.   I was happy that GTE supported my attending IJCAI in Los Angeles that year.  It was my first conference paper and presentation.

The basic idea of my work was based on the informal observation from Rubik’s Cube solving, that it was necessary to (temporarily) undo progress, i.e. subgoals achieved, in order to achieve new subgoals, and ultimately solve the puzzle.  The idea of a macro-operator is a sequence of moves that can transform the puzzle state in a predictable way.  One simple example is to permute the positions of 3 cubies (1x1x1 puzzle elements) but leave everything else unchanged.  Lots of other things get changed (messed up) during the macro, but when the sequence is complete and the dust (mess) has cleared, everything is back where it was except for the 3 cubies that played “musical chairs” (changed their positions).  The learning problem that interested me was how these macros could be discovered  by a program.  My basic approach was to use a heuristic evaluation function  to count the number of subgoals achieved.   My program examined sequences of moves along paths down the search tree, and looked at the values of this heuristic function for each puzzle state encountered along the paths.  Typically, in puzzles like Rubik’s cube, there would be peaks and valleys along the path.  Peaks were where more subgoals were satisfied than in the states before and after.  The valleys corresponded to undoing subgoals, and it seemed natural that the sequence of moves spanning a valley (extending from one peak to the next peak) would be a good candidate for a macro, especially if the 2nd peak was “higher” than the first (meaning that overall progress had been made).
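For readers who like code, here is a minimal sketch of the valley-spanning idea in Python.  Everything here is illustrative – the heuristic values and move labels are invented, and I am reading the “second peak higher” observation as a simple net-progress filter:

```python
def macro_candidates(scores, moves):
    """Propose macros from one path in the search tree.

    scores[i] is the heuristic value (number of subgoals satisfied) of
    the i-th state on the path; moves[i] is the move taking state i to
    state i+1.
    """
    # Local peaks: states scoring higher than both neighbors
    # (endpoints only need to beat their one neighbor).
    peaks = [i for i in range(len(scores))
             if (i == 0 or scores[i] > scores[i - 1])
             and (i == len(scores) - 1 or scores[i] > scores[i + 1])]
    candidates = []
    for a, b in zip(peaks, peaks[1:]):
        if scores[b] > scores[a]:          # net progress across the valley
            candidates.append(moves[a:b])  # the move sequence spanning it
    return candidates
```

Note that no complete solution is required – any path with a peak-valley-peak profile yields a candidate.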

I also made a representational commitment to having macros be described in the same format as primitive problem-solving operators (e.g. the basic twists of Rubik’s cube).   I represented all operators (both primitives and macros) in terms of the change of state they entailed – specifically, in terms of a before and after state.  These were actually partial states, since they could have some parts of the state specified as not relevant (I called them “don’t cares”).  The principal advantage of this representational commitment is that the problem solver does not require modification in order to use additional macros!  The problem-solver (performance system) simply used a set of available operators, and if new macros were found, they could be added to the operator-set and could then get used like any other operator.  The main performance gain  resulting from learning macros is that the search can take larger steps in the problem space, since each macro actually involves multiple primitive steps.
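As a concrete (though hypothetical) illustration: if states are fixed-length tuples, None can play the role of a “don’t care”, and the uniform before/after format makes testing and applying an operator identical for primitives and macros alike:

```python
def applicable(before, state):
    # An operator matches when every *specified* component of its
    # before-pattern agrees with the state; None matches anything.
    return all(b is None or b == s for b, s in zip(before, state))

def apply_op(before, after, state):
    assert applicable(before, state), "operator preconditions not met"
    # Specified components of the after-pattern overwrite the state;
    # don't-care components pass through unchanged.
    return tuple(s if a is None else a for a, s in zip(after, state))
```

A 3-cubie “musical chairs” macro would then have a before/after pair that mentions only those 3 positions, with don’t-cares everywhere else.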

I like to think of this as a chunking model of skill acquisition, with macros being larger chunks defined in terms of simpler chunks.  Chunking is a well-known and well-studied psychological phenomenon, and is the source (imho) of human ability to deal with complexity.   In order to represent my macros in the same format as primitive operators, I needed to compile a sequence  of steps into a single step, which involved analyzing the before and after states spanned by the sequence.  In addition, this compilation process produced a macro-definition or expansion, which could allow any solution found in terms of macros to be expanded into a solution using only primitive operators.  Primitive operators were distinguished and recognized by the fact that their definition (expansion) was empty (NIL).   In fact, macros could be learned in terms of other macros leading to a definitional hierarchy.  One final advantage in my approach was that learning could take place during problem-solving, even before a solution was found.  My first program only learned macros from analyzing a completed solution path, but I later generalized this so that macros could be learned from any path in the search tree, even before a solution was discovered.
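The definitional hierarchy bottoms out exactly as described: expansion is recursive, and an empty definition marks a primitive.  A sketch (the operator names here are invented):

```python
def expand(op, definitions):
    """Recursively expand an operator into a primitive move sequence.

    definitions maps each operator name to its defining sub-sequence;
    an empty list (the NIL expansion) marks a primitive operator.
    """
    subs = definitions[op]
    if not subs:
        return [op]            # primitive: expands to itself
    result = []
    for s in subs:
        result.extend(expand(s, definitions))
    return result
```

So a solution found in terms of macros can always be mechanically unwound into primitive twists.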

I continued to work on this heuristic macro-learning project while employed at GTE Labs.   My work led to the publication, in the prestigious Machine Learning Journal in 1989, of my paper “A heuristic approach to the discovery of macro-operators”.  I am indebted to my friend and editor, Pat Langley, as well as my GTE Labs colleague, Oliver Selfridge, for invaluable help in finishing and publishing this paper.

Sadly, the ideal research environment I experienced at GTE Labs was time-limited.  After my first year there, GTE decided to eliminate the Fundamental Research Lab, and focus on more “applied research”.  This presented a challenge to all of us in the Machine Learning department. Fortunately, we kept our jobs, but the department was moved into a more applied  “Computer and Intelligent Systems Lab”.  We continued to work on machine learning, but there was much greater pressure to apply it to GTE telephone or other operations.

There were (at least) 3 different machine-learning projects within our group, and they had been pursued fairly independently by 3 different researchers.  We were directed to work on ways to integrate these disparate learning approaches (Decision-tree Induction, Rule-based Learning, and Macro Learning) into a unified project.  We struggled with how to do this, but in the end we succeeded in creating the ILS (Integrated Learning System), in which each learning method both proposed actions to take (performance system) and tried to improve its own behavior (using its individual learning system).  The integration involved a TLC (guess who proposed that acronym?) which I called The Learning Coordinator.  The TLC would collect action proposals from each sub-system, and distribute all the proposals to each sub-system.  Then each sub-system would give a rating to each proposal, according to how well it thought the proposed action would work out if actually performed.  These ratings (numerically weighted votes) were collected and averaged, and the highest-rated action would be performed.  The results of the action (the next state of the world or environment) would be made available to each sub-system for use in its own learning.  This seemed to me like a fairly simple idea, and it was the only one we implemented – it was actually a fallback from discussions and proposals we had for much more complex systems, but we never got agreement or traction on implementing the more ambitious proposals.
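The TLC cycle is simple enough to fit in a few lines.  This is only a sketch of the scheme described above – the propose/rate/observe interface is my naming for this post, not the actual ILS code:

```python
def tlc_step(agents, state, perform):
    """One decision cycle of a Learning-Coordinator-style ensemble.

    Each agent is assumed to supply propose(state), rate(state, proposal),
    and observe(state, action, next_state).  perform(action, state) is the
    (simulated) world's transition function.
    """
    proposals = [agent.propose(state) for agent in agents]
    # Every agent rates every proposal; average the ratings per proposal.
    avg = [sum(agent.rate(state, p) for agent in agents) / len(agents)
           for p in proposals]
    best = proposals[avg.index(max(avg))]   # highest average rating wins
    next_state = perform(best, state)       # act in the world
    for agent in agents:                    # everyone learns from the outcome
        agent.observe(state, best, next_state)
    return next_state
```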

I thought this ILS framework was an interesting idea that merited further development and quantitative analysis.  I proposed experiments I thought we should do to see how much benefit arose from the collaboration of the different systems.  I was already a fan of collaboration from my ESG experiences at MIT.  There are careful and (to me) obvious experiments that could be done to measure and compare the learning of each sub-system (in isolation) with its learning in the context of joint collaboration.  It’s not clear whether the ILS would outperform the individual systems working alone, but there are at least 2 reasons for hope:

1.   With 3 alternate actions to choose from, one hopes that the action chosen would often be better than that proposed by any single sub-system.  This is simply a performance issue, and does not rely on learning.

2.   Each sub-system (agent) would likely see actions taken that were different from its own proposal, and this should expand its learning opportunities.

My proposed experimental framework was fairly simple:

a.  Define a performance metric for evaluating behavior (in a simulated, thus repeatable, world)

b.  Use the performance metric to evaluate the Agents, and the ILS as a whole, with all learning turned off.  These provide the baselines.

c.  Have each Agent perform and learn in isolation, and also have the ILS system as a whole perform and learn.

d.  Finally, evaluate the performance of each individual Agent, and the whole ILS, again with learning turned off.

The interesting questions to me are:

1. How much learning occurred within individual agents?

2.  Did the ILS ensemble learn more than the individual agents on their own?

Learning would be “measured” as the difference in performance scores.
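In code, the measurement is just a frozen-evaluation sandwich around a learning phase – run it once per Agent in isolation and once for the whole ILS, then compare the gains.  A sketch (the function names are mine):

```python
def learning_gain(system, evaluate, train):
    """Measure learning as the change in performance with learning off.

    evaluate(system) scores the system with learning disabled (steps b, d);
    train(system) runs the perform-and-learn phase in between (step c).
    """
    baseline = evaluate(system)          # (b) pre-learning baseline
    train(system)                        # (c) learn in the simulated world
    return evaluate(system) - baseline   # (d) post-learning score minus baseline
```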

It saddened me that my colleagues resisted doing these experiments.  I understood this to be for “political reasons” – the expressed concern was that failure (if the system didn’t learn well) was viewed as much more dangerous than any success.   I hated this attitude – which strikes me as unscientific (“I’d rather not find out at all, than find out I was wrong”?).   In case you haven’t figured this out about me,  I deplore politics, especially when it impedes progress.   I still think these experiments would be interesting to perform (perhaps using different agents), and maybe I’ll get back to them someday …

Oh yeah – one other ironic note:   Our team was nominated for and received a prestigious corporate award for our work!  It makes a great resume entry under Awards and Honors:

      GTE’s Leslie H. Warner Technical Achievement Award, May, 1992, for work on the Integrated Learning System. The Warner Award is GTE’s highest technical achievement award.

We got a chunk of money to divide up, and an all-expenses paid trip to New York City for the award ceremony (presented by the CEO of GTE).  We also got several publications out of the work.  But sadly, I don’t think it has had much impact on the world – unless someone read our papers and found a way to use the work.  My colleagues, being risk-averse, would not consider trying to deploy the system anywhere within GTE – the dangers of failure outweighing any possible benefits.

This all strikes me as Dilbert-esque.  Maybe there’s a good reason – Scott Adams worked at Pacific Bell (another telephone company), which provided him with experiences that fed into his Dilbert comic strip.  GTE exhibited nearly all the craziness on display in Dilbert, and sadly there is more truth behind it than you’d expect!  At GTE we had frequent reorganizations, during which little work got done.  We had management that seemed to hinder, rather than support, our work.  There was a tendency to avoid doing anything that could lead to “visibility” and/or “perceived failure”.  In science, finding out that something didn’t work can be a step of progress toward finding something that does work – but at GTE there was no such thing as a successful negative result; if it didn’t work, you were a failure, so better not to try things.

My career at GTE Labs came to a crashing halt in 1996 when our entire 4-person team was laid off.   We had the option of seeking other positions, but most of us ended up leaving for other jobs.  GTE had been my “work home” for 11 years, and I was sad to leave.   On reflection, I feel that I (and we) could have done much better work given more supportive and encouraging circumstances.  My colleagues were all extremely smart, and I consider them long-term friends to this day.   I have come to believe that large corporations are often impediments to progress (scientific, technological, and social).  More on that another time, perhaps.

Marriage and Parenthood

While working at GTE, things got better in my marriage.  We had economic stability, a house we had purchased and fixed up, and our son, Aaron.   I loved being a father, and have fond memories of interacting & playing with him,  reading to (and later with) him, and playing video games.  The next major highlight of these years was the birth of our 2nd son, David …

Our Family Grows!  (David born July 7, 1987)

My wife and I welcomed our 2nd son, David, into our family in July of 1987.  This was a very happy time for all of us.  Aaron seemed to adapt well to his role as “big brother”, and we added a 2nd bedroom to the small ranch house we lived in.  I have very fond memories (and some favorite pictures) of carrying David around on my shoulder when he was an infant.

Interestingly,  David arrived at 2:06am, and our home address at the time was 206 Concord Ave.  Coincidence?  Probably!   Interesting?  I love it!

Around 1989 we sold our house in order to purchase and move to a larger residence, still in Lexington, MA.  Moving is always stressful, but we got through it.  I remember my father coming up from PA to help out with packing and moving (interesting side note: for all my life I’ve lived in the MA and PA states!).  We moved in January 1990, I think, and I remember we had a horrible ice storm the day of the move – the driveway of the new place was a sheet of ice!  Despite the challenges, we got everything unloaded, and started to settle into our new house, which was much larger (3 bedrooms).  Within a year, our family was to grow yet again!

Birth of a daughter!  (Rachel arrives Dec. 2, 1991)

On the first day of Hanukkah, our family grew once again with the arrival of our 1st daughter, Rachel!  My wife and I, as much as we enjoyed our 2 boys, were grateful to have a daughter, too!  All 3 kids have been a special blessing, and I feel it has been my privilege and honor to be their father.  When I look back on my life, being a parent (and, if I do say so myself, a rather good one) is my proudest accomplishment (of course it’s not done yet – I’m still their parent, and hopefully can still contribute a bit to their growth and development – and they all continue to contribute to mine, as well!).

I think I’ll end this note on that upbeat note!

Next time:  career transitions:  researcher -> software developer -> freelance puzzle designer

… to be continued

Notes on a Life in Progress – part 4 (Early career, marriage, parenthood 1979-1985)

Pittsburgh, PA  – CMU (1979-1981)

The first step in my career path was moving to Pittsburgh (with my new wife) in order to work as a programmer in the Psychology Department at CMU (Carnegie-Mellon University).  I turned down an industry offer from Texas Instruments at a higher salary.  I was always interested in academia, and I worried that transitioning from industry to academia (and taking a likely salary cut) would be more difficult than the opposite transition.  I was also curious to experience CMU, since it was another of the major U.S. centers of AI Research (along with MIT and Stanford).  The programming work was supported by an ONR contract and involved work on learning, so I was excited about it.

At CMU, the Psychology and Computer Science Departments both worked on AI and machine learning, led by the famous duo of Herbert Simon (Psychology) and Allen Newell (Computer Science).  In addition to the faculty, I found the graduate students in both departments to be very friendly, welcoming, and stimulating to interact with.  A number of long-term friendships arose out of my time at CMU.

Pittsburgh had somewhat of a negative reputation, so I was prepared to be disappointed by it (relative to Boston/Cambridge which I loved!).  I was pleasantly surprised — the city was much cleaner than it was in the past, and the people were friendly.  Though it didn’t offer all that Boston did, there was still plenty to do.  I have fond memories of going white-water rafting on the Youghiogheny River, visiting Fallingwater (Frank Lloyd Wright house), and playing lots of tennis and racquetball with friends and colleagues.

One of the things I most enjoyed about CMU was the more accepting attitude toward research involving games and puzzles.  Newell and Simon studied (and wrote the book on) Human Problem Solving, and used puzzles such as the Tower of Hanoi and Missionaries and Cannibals as vehicles for exploration.  Richard Korf even wrote his Ph.D. thesis (in CS) on an algorithm to calculate macro-operators for solving Rubik’s Cube and the 15-puzzle, among others.  Hans Berliner studied and made contributions to computer chess playing.

During my time at CMU, I had the pleasure of meeting Pat Langley, then a new post-doc in Psychology, who had written his PhD thesis on BACON, a computational approach to scientific discovery.  He and I hit it off due to our mutual commitment to Machine Learning as a key to AI, and we have become lifelong friends.  I have vivid recollections of participating, along with Pat and many others, in a project to build a simulated world as a testbed for AI and ML research.  This group, called by the somewhat grandiose name of world-modelers, sparked numerous interesting discussions.  We all shared a commitment to the idea that simulating a testbed environment had numerous advantages compared with the “real world” as used in robotics work:

1. No worries about “hardware” breaking (a bane of robotics researchers)

2. Greater reproducibility of results

3. Ability to modify the (simulated) environment in carefully controlled ways

4. Software can be copied and shared, so simulation tools can easily be used by many other researchers

5. A simulated environment can be simple (if desired),  to allow focusing on critical issues.

The last point was actually controversial within the group.  I advocated for starting out with very simple “worlds”, because those could be more easily programmed, getting us to the actual business of AI research much more quickly.  There were others, especially those with interests in machine vision, who argued for a realistic 3-D simulated physical environment.  I would have been quite content with a simple, abstract, 2-D grid-world.  Unfortunately, this issue divided the group, and my recollection is that things never really “got off the ground”.  Nevertheless, I continue to believe, to this day, that simulating simple grid-world environments is a valuable way to explore AI/ML.
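To show just how simple such a testbed can be, here is a toy grid-world of the kind I had in mind – illustrative only, with invented move names:

```python
class GridWorld:
    """A minimal 2-D grid testbed: one agent, four moves, clamping walls."""

    MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

    def __init__(self, width, height, start=(0, 0)):
        self.width, self.height = width, height
        self.agent = start

    def step(self, move):
        dx, dy = self.MOVES[move]
        x, y = self.agent
        # Walls simply clamp motion – trivially reproducible "physics",
        # with no hardware to break.
        self.agent = (min(max(x + dx, 0), self.width - 1),
                      min(max(y + dy, 0), self.height - 1))
        return self.agent
```

A couple of dozen lines, fully reproducible, trivially shareable – which was the whole point.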

Not much to say about my marriage during this time, though my wife expressed unhappiness about Pittsburgh (she complained that it wasn’t near the ocean), and I speculate that she may have harbored a touch of resentment at “following me there”.

Northampton, MA – Hampshire College (1981-1985)

Sadly, the funding ran out for my programming work at CMU, and after just under a year, I found myself unemployed.  I began a job search, including looking at private schools and some colleges.  I was thrilled to receive an offer to teach Computer Science at Hampshire College (in South Amherst, MA).  Hampshire College was (and is) an experimental college, created as a “5th college” to join Smith, Amherst, Mt. Holyoke, and U.Mass. Amherst.  It was designed and started in the late 60’s, during the same time period that ESG (see earlier post) was formed at MIT.   Hampshire and ESG have a number of similarities.  They are both committed to student-directed education, fostering a shared sense of community, interdisciplinary studies, and educational innovation.  I jumped at the opportunity to join the Hampshire faculty, and so in Summer 1981, my wife and I moved to Northampton, MA, where we lived in an apartment adjacent to the Smith campus.  My wife was happy to be back in Massachusetts and closer to her parents who lived north of Boston.

I loved the “5-college area”, which was like a smaller-scale Boston/Cambridge.  I enjoyed all that the 5 colleges had to offer in terms of both social and academic activities and stimulation. I made many friendships, both on and off campus, and at Hampshire with both faculty and students.  I found the students at Hampshire to be very motivated and energetic, and it was a pleasure to interact with them.  Students propose their own courses of study, so I had many meetings with students to discuss projects and areas of study.

I was part of the School of Language and Communication (called “L&C” for short).  Hampshire had 4 Schools, rather than departments, in part to encourage interdisciplinary interaction.  L&C (later re-named to Communication and Cognitive Science) had 2 primary foci:  cognitive science (psychology, linguistics, math, logic, computer science, and philosophy) and communications studies (media, history of technology, among others).  I was the faculty person representing AI and computer science.

While Hampshire encouraged individual student projects and studies, there were also courses taught by faculty.  Team-teaching was encouraged, and I have fond memories of co-teaching a number of courses with colleagues.  Perhaps my favorite was called Structures of Computation, where we examined the different levels of organization involved in computation.  This covered the span from low-level hardware (transistors, gates, and flip-flops) through high-level software such as compilers and interpreters (and all the levels in-between, such as ALUs, microcoding, machine code, and assemblers).  I am fascinated with how complex structures can be built up out of simpler components (modules).  Perhaps this stems from all the time I spent playing with wooden blocks, Lincoln Logs, and bricks (pre-Lego) as a very young child.  I believe this powerful concept of modularity, and hierarchical (layered) structuring, is a cornerstone of most if not all areas of engineering.  I think it is fundamental, as well, to learning and skill acquisition (but more on that another time!).

In 1981, I bought my first personal computer, an Apple II+, along with a (dot-matrix) printer, modem (300 baud!), color monitor, and 2 external floppy disk drives.  It seems unbelievable today that this machine could do all it did with only 64K of RAM (and that was the “souped up” hardware configuration).  I remember programming in LOGO and Apple BASIC.  LOGO was the turtle-graphics language pioneered by Seymour Papert for use in teaching children computational and mathematical thinking and problem-solving.  Later, I added a SoftCard (Microsoft’s early Z80 hardware plug-in board), so I could run CP/M as well.  I have fond memories of playing around with this computer.  I spent more than $4000 on the computer, peripherals, add-ons, and software.  Amazing how over the years computer performance has increased so much, and prices have dramatically declined!

One of my initiatives at Hampshire was an attempt to create an ESG-like program for computer science students, which I called The Learning Community (TLC).  It was enthusiastically embraced by a number of students, and we had regular meetings, published a weekly newsletter, and engaged in learning and sharing about a variety of interesting topics.  The greatest impediment to greater success was that we lacked a dedicated physical space where our community could congregate to interact – this seemed to be a key ingredient in ESG’s success.  Nevertheless, I was encouraged by the students’ enthusiasm, and hoped to continue to grow the program.  I even explored seeking dedicated space in a dorm in order to “house” TLC.  This would have improved, in my opinion, on ESG, by further integrating living and learning, which is an ideal I fully support.  Unfortunately, not all my faculty colleagues shared my enthusiasm for TLC, and I paid a political price for “forging ahead” with it (more later).

Parenthood!  (Aaron born June 18, 1983)

I always knew I wanted to be a parent.  My relationship with my own father was a mixture of positives and negatives (more on this another time).  I aspired to be the kind of “ideal” father that I had always wished for.  I got my chance to try when my first son, Aaron, was born on June 18, 1983.  This was a highlight of my life!  It was also a wonderful Father’s Day present, since Aaron arrived on Saturday morning at 6:18 a.m. (was he a budding numerologist?  6:18 on 6-18-83!), which was the day before Father’s Day that year.  I sat down (at my Apple II) that night, and wrote a long letter to my son, hoping he’d read it when he was older.  I remember expressing my excitement, and hopes and aspirations for our relationship, and encouraging him to grow and develop into the best person he could be.

Parenthood wasn’t easy, though.  There were many sleepless nights, diapers to change, feedings to do, etc.  The stresses exacted a toll on my marriage, unfortunately.  I was also still dealing with my teaching work at Hampshire, and facing a reappointment review during the 1983-84 academic year.

The positives were truly great, and I have absolutely no regrets!  I remember we bought a video camera (an early Olympus VCR cassette system) shortly before the birth, and I had a great time learning to use it, and then documenting Aaron’s early development!   Also, the part of me that is a “researcher into the nature of intelligence” was fascinated to observe Aaron’s development.  It was fascinating, for example, to see how the “simple” skill of turning over, actually is painstakingly learned through trial and error.  Aaron wanted to be on his stomach, since then he could move around a little.  When placed on his back he was “stuck” – but he tried and tried to figure out a way to turn over.  He would twist his back and extend his leg, and eventually (after many days and weeks of attempts) got his body flipped over, with the small residual problem that his arm would get tucked under his body, and he couldn’t get it out – this, too, required some learning to work around.  It’s fun to go back and watch this amazing learning process on the videotapes.

When Aaron was maybe a year old, I remember sitting him on my lap so he could “play” with a program I wrote for him on the Apple II+.  It was a relatively simple BASIC program that would respond to any keypress by flashing random colored pixels on the monitor, and at the same time play random beep tones.  Aaron seemed to enjoy this, and soon he was whacking away at the keyboard!  Who knows how much influence this had on his developing into the highly-skilled software developer he is today!?

Not reappointed at Hampshire

My reappointment review did not go well.  My take on it is that my “stubbornness” in pursuing The Learning Community project was viewed as non-collegial, and irked my colleagues.  There may have also been some retaliation for things I candidly wrote during the earlier reviews of some other colleagues (but that’s only speculation on my part).  In general I attribute it to my political naivety and my mishandling of interactions with my colleagues.  I was an idealist and a maverick, and butted up against institutional conservatism (at Hampshire, of all places) and “departmental politics”.

Not getting reappointed placed an excruciating stress on my marriage.  It nearly led to divorce.  My wife clearly was upset at the sudden removal of any long-term economic stability in our marriage, and things gradually worsened month by month.   I was not nearly as concerned – I had sought and found jobs before, and was reasonably confident I would do so again.  I had over a year of “cushion” to look for new opportunities, since my contract gave me a 4th year at Hampshire, even after the decision not to reappoint happened during my 3rd year.  I interviewed at several places, mostly in industry, as I recall, and by January 1985, had an offer from MITRE Corporation in Bedford, MA (at a salary more than twice what I received from Hampshire).  I’ll never forget my wife’s reaction to the news:  “Maybe I shouldn’t be so quick to divorce you”.   A mixed blessing at best.  It clearly indicated to me that the primary basis of our marriage was financial.  On the other hand,  I wanted to stay as closely involved with my son as possible, and dreaded the prospect of a divorce, so I was willing and satisfied to continue “working on the marriage”.

Moving to Lexington

I was scheduled to start work at MITRE on July 1, 1985.  My wife and I started looking for a place to live in the Bedford area.  She was naturally pleased to be moving even closer to her parents.  We initially looked at houses in Arlington, but were getting discouraged, and began thinking about renting.  Then “at the last minute” (June, I think) we looked at a house in Lexington.  We had considered Lexington to be out of our price range, but this house (a small ranch) seemed potentially manageable.  We put in an offer that was accepted, and were looking to move in August.   To complicate matters,  in the latter part of June, I attended a Machine Learning conference in Skytop, PA, where I was given a job offer to join GTE Laboratories as a Machine Learning researcher.  I had interviewed with GTE Labs back in January, and it was clearly my first choice, but they couldn’t extend an offer at that time.  So I faced a dilemma – but ultimately the choice was clear – I had to go with my 1st choice and accept the GTE offer, even though it meant backing out of my commitment to start work at MITRE. The Friday before July 1 (when I was to start work) was a rather momentous day:   I declined the MITRE offer (they were really nice about it!),  accepted the GTE offer (yeah!), and to top things off, my wife and I signed the Purchase & Sale agreement for the house in Lexington.  Wow!  So on Monday I started work at GTE Labs instead of MITRE.  Because we didn’t actually move until mid-August, I lived temporarily in a dorm on the campus of Bentley College in Waltham.

Things got better in my marriage during this period, and I was quite happy (at least initially) with my work at GTE Labs (more on this later), and I continued to be a very happy father!

to be continued …

Notes on a Life in Progress – part 3 (Graduate School 1974-1979)

In Fall of 1974 I entered MIT’s Graduate School.  This was the next step in my plan for a lifetime of research in AI/machine learning.  I was officially admitted through the Math Department, which served as my “host” department.  In fact, I was enrolled in an interdisciplinary PhD program through DSRE (the Division for Study and Research in Education).  The mechanics of this involved setting up an interdisciplinary committee to oversee my studies.  Actual requirements were then negotiated with my committee.  My committee included Seymour Papert (AI/ML, education, and developmental psychology), Susan Carey (cognitive psychology), and Dan Kleitman (mathematics, combinatorics).  My plan was to work directly for my PhD (skipping a Masters degree).

This setup seemed ideal!   In many ways it was like ESG at the graduate level.  I had tremendous freedom to define and pursue my own interests.  I was encouraged to explore multi-disciplinary interactions.  DSRE itself was set up as an interdisciplinary entity at MIT – including among its luminaries:  Seymour Papert (who was my graduate advisor),  Ben Snyder (psychiatrist), Don Schon (urban studies & creativity), and Jeanne Bamberger (music, education).   The unifying interest shared by all of these (and by me, too!) is in learning in all its forms, and how a deeper  understanding of learning can inform how we design education (create rich learning environments!).  I chose Seymour Papert as my advisor and mentor because we shared so many interests: understanding thinking and learning, math, computers, puzzles, and learning new skills. As part of the LOGO Learning Lab, Seymour encouraged everyone (both children and adults) to engage in novel and fun learning activities.  For example, circus arts such as juggling, pole-balancing, and bongo-board balancing were widely shared and explored.  We would not only learn these skills, we would analyze them and explore ways to teach them!  The same was true of puzzle-solving. Seymour, like me, was a puzzle enthusiast and collector.  We enjoyed sharing puzzles and discussing how we solved them.  One of Seymour’s memorable quotes is “You can’t think about thinking without thinking about thinking about something!”.  So basically anything at all that we spent any time thinking about became source material for thinking about how thinking worked.  I loved this kind of self-reflection.

Machine Learning was perhaps my central academic interest in pursuing my graduate studies.  It seemed clear to me that any true artificial intelligence must be able to learn, in order to adapt to new situations, as well as to extend its knowledge and skills.  Much of AI at the time worked in the paradigm of building a system to demonstrate a competence that was considered part of intelligence.  Examples included playing chess (Greenblatt), debugging programs (Sussman), and language understanding (Winograd).  The focus seemed to be on direct programming of skills and knowledge.  This approach, while certainly worthwhile for initial exploration, seemed too long and arduous a path to true machine intelligence, and if the resulting systems lacked learning capability, they would always be limited and brittle.  One exception was the thesis work by Patrick Winston on machine concept learning (the classic “Arch Program”).  This work was very influential on the direction of my research, and I ultimately added Winston as a co-Thesis Advisor (with Papert).

A Research Maverick

As I mentioned, pursuing machine learning ran counter to the dominant AI paradigm at the time.  Many people (faculty and fellow grad students) argued that it was “too difficult”.  Maybe it was difficult, but I was strongly convinced that it was the key to building AI.  If we could just build a general learning system, then we could educate it – let it learn the skills and knowledge we wanted it to have!  Of course, to make progress, it would be necessary to start with simple learning, which initially would not result in impressive performance.  Because most AI research was funded by DARPA (Defense Advanced Research Projects Agency), there was quite strong pressure on researchers to generate impressive results.  I felt at the time (and still do in the present day!) that developing powerful AI systems required more of a basic research approach.  My thoughts on this were likely influenced by my mathematical training — I approached things from an abstract direction and wanted to understand basic core principles (Ross’s dictum: Think deeply about simple things!).  My ultimate intellectual goal was to develop a general and abstract theory of intelligence which would subsume both human and machine intelligence.  It occurs to me that my commitment to a learning approach to AI is analogous to the technique of mathematical induction (prove the assertion for n=1, and prove that if it holds for an arbitrary n, then it holds for n+1).  The learning approach, admittedly challenging, seemed like a high-risk high-reward direction to pursue.  If successful, AI researchers would no longer have to work arduously to encode specific skills and competencies – the system could simply learn them!

Another dominant aspect of the prevailing AI paradigm was working on individual pieces of intelligence, for example planning, language understanding, game-playing, problem-solving, robotics, and even concept-learning.  These were all studied in relative isolation.  I heard little or no discussion regarding the overall architecture of intelligent systems.  The research approach was essentially bottom-up – build the pieces, and then figure out how to put them together.  I recall being struck by the research approach in nuclear physics to developing controlled fusion.  Yes, they focused on specific problems (attaining high temperatures, plasma containment, plasma density), but these sub-problems were studied in the context of a set of alternative working models (e.g. Tokamak, and laser implosion).  AI didn’t have any working models for how an artificial intelligent system would be organized!  It struck me that there was tremendous heuristic value in having at least one working model — specifically to help focus research attention onto critical sub-problems, and at the same time help define the sub-problems by suggesting how the sub-problem solutions needed to interact with other (yet to be created) pieces.  One of the worst examples (to my mind) of the piecemeal approach was the work in Knowledge Representation, where there were numerous KRL (Knowledge Representation Language) proposals, but little attention to or work on the ways in which these systems would be used.  The CYC project also seems to favor this paradigm — let’s just encode lots of facts, and worry later about how to use them.  In knowledge representation work, a deep philosophical truth was (imho) overlooked — representation is a process!  Static symbols and data structures are not endowed with inherent meanings or representations.  It is the processes that interpret and work with those structures that are the key element in representation!  I sum this up in one of my favorite original slogans:

     No representation without interpretation!

My observation is that many philosophers don’t fully appreciate this.  I cringe when I hear discussions of meaning totally lacking any appreciation for all the processes (perception, interpretation) necessarily involved for meaning to exist at all.  It is seductive to imagine that words, for example, have inherent meaning, but the meaning cannot reside in the words themselves.  To have any real appreciation of meaning requires examining the social, cultural, perceptual, psychological, and learning processes that in effect attach meaning to particular words and symbols.  But I’m straying from my topic (I plan to write at greater length on my philosophical thoughts at a future time).  Back to research strategies — whenever I suggested a top-down research approach (building integrated working models), the typical reaction I received was that “it’s just too hard and we don’t know enough at this point”.  I still think top-down is the “right way to proceed”, and I’m encouraged by the evolving sub-discipline of cognitive architectures (examples include Langley’s ICARUS and Laird and Rosenbloom’s SOAR architectures), but those weren’t developed until the 1980’s and later, and I think they still suffer a bit from “results pressure” from funding agencies [I wish there were more appreciation of and financial support for basic research].

One central personal learning goal for my graduate years was to develop my skills as a researcher.  It seemed essential to learn how to define a research problem.  So when it came time to select a thesis topic, I used this as an opportunity to begin learning this skill.  I was not content to work on an “externally defined” problem — there were plenty of such problems that already had funding, and choosing one of those would have been the easy path.  Instead I generated a series of proposals; the initial ones were overly ambitious, and naturally I didn’t get very far with them.  One of my first ideas was to take Winograd’s SHRDLU (one of the great successes of early AI, which demonstrated rudimentary language understanding), and work on a learning version of it.  This had the potential for a more integrated approach – it would integrate several sensorimotor modalities (hand-eye coordination in manipulating a simulated blocks world, and language generation and understanding).  I even thought about having the system learn motor skills in the blocks world.  The problem was that this was way too difficult, and worse, tried to solve too many problems at once — it lacked focus.  It might serve well as a lifetime research project, but was not manageable as a thesis (I hoped to finish my PhD before I retired or died).

I came to realize that I suffered from a serious “grandiosity” bug — I wanted whatever I did to be big, amazing, and spectacular, maybe even revolutionizing the field 🙂     What I needed was to simplify and focus on smaller, more manageable problems.  I think I also lacked the skill of working on large projects.  My training in mathematics and computer science had mostly consisted of working on smaller problems and projects.  The biggest project I had worked on was my Summer UROP research, but even that didn’t seem to scale up to a multi-year thesis project.  The thesis topic I finally settled on was “Extensions of Winston’s ARCH Concept Learner”.  I chose this because it was one of very few pieces of AI work that was centrally about learning, and also because I really liked the work itself (the way it used semantic nets to represent concepts, and the training paradigm of positive and negative examples).

A thesis is born

So I started out by writing a (friendly) critique (from an admirer’s point of view) of Winston’s concept learner.  I recall coming up with something like 6 directions in which the work could be extended, and my initial proposal was to work individually on each of these, and collect the results into my final thesis.  This had the heuristic advantage of dividing the “problem” into subproblems.   To further simplify, I selected just 3 of these extensions to work on:

1. Learning disjunctive concepts (Winston’s only learned conjunctive concepts)

2. Learning relational concepts (Winston had relational primitives, like ON & TOUCHES, but didn’t learn new ones)

3. Learning macro concepts (allowing any learned concept to be used as a primitive to represent and learn more complex concepts) (Winston’s work included some of this already, but I wanted to generalize it to cover disjunctive and relational concepts as well).
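To give a concrete feel for the first extension (learning disjunctive concepts), here is a toy sketch in modern Python.  This is emphatically *not* the thesis algorithm — the representation (attribute-value pairs) and all names are my own illustration.  The idea: a disjunctive concept is a set of conjunctive descriptions; a positive example no disjunct covers either safely generalizes an existing disjunct or spawns a new one.

```python
# Toy disjunctive concept learner (illustrative only, not the thesis method).
# A conjunctive description is a frozenset of (attribute, value) pairs;
# a disjunctive concept is a list of such descriptions.

def covers(conj, example):
    """A conjunctive description covers an example if all its conditions hold."""
    return conj <= example

def matches(concept, example):
    """A disjunctive concept matches if any disjunct covers the example."""
    return any(covers(d, example) for d in concept)

def learn(examples):
    """examples: list of (frozenset of (attribute, value) pairs, is_positive)."""
    negatives = [ex for ex, pos in examples if not pos]
    concept = []
    for ex, positive in examples:
        if positive and not matches(concept, ex):
            # Try generalizing an existing disjunct by dropping the conditions
            # the new example violates -- but only if no negative gets covered.
            for i, d in enumerate(concept):
                g = d & ex
                if g and not any(covers(g, neg) for neg in negatives):
                    concept[i] = g
                    break
            else:
                concept.append(ex)  # no safe generalization: new disjunct

    return concept

arch  = frozenset({("shape", "arch"),  ("color", "red")})
tower = frozenset({("shape", "tower"), ("color", "blue")})
pile  = frozenset({("shape", "pile"),  ("color", "red")})
concept = learn([(arch, True), (pile, False), (tower, True)])
print(len(concept))  # 2 -- one disjunct per structurally different positive
```

A purely conjunctive learner (like Winston’s) would be forced to intersect the two positives down to the empty description here; allowing disjunction lets the concept keep them as separate alternatives.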

It was natural to have Winston as my (co)Thesis Advisor for this work, and I thank him for his patience, attention, and advice!

Masters thesis as “Consolation Prize”

By the end of my 5th year of grad school, I had only completed the 1st item (with a little preliminary work on item 2 as well).  It looked like another 2 or 3 years would be required for me to finish my planned thesis.  I was feeling frustrated, since my progress was much slower than I expected of myself, and I was losing self-confidence.   At the same time, my funding was running out.  To continue, I would have needed to start taking out loans.   I was very nervous about accumulating significant debt, and feared that even after a few more years I might still be unsuccessful at finishing.

So I decided to wrap up my work thus far as a Master’s thesis, collect my SM degree, graduate, and look for a job.  My S.M. thesis was titled “Learning Disjunctive Concepts from Examples” and I think it was a very solid piece of work.  I collected my SM degree in August 1979, and withdrew from my graduate program.  I had the intent of returning at some point to complete my PhD, but alas, that was not to be.

Non-academic threads of my life during the graduate years

During most of my graduate years I served as one of 3 graduate resident tutors in Bexley Hall (the undergraduate dorm I lived in when I was an undergrad myself).  I very much enjoyed both the social and mentoring aspects of this position, and have developed a number of lifelong friendships with students from those Bexley years!

I did not own a car during grad school, and don’t know where I could have parked it if I could have afforded one.  I did, however, purchase a motorcycle (a used Honda 350) which I learned to ride, and parked in the Bexley courtyard.  I had many interesting adventures riding to PA (to visit family) and New Hampshire, and also to Danbury CT to visit my first serious girlfriend when she moved back there.  I remember reading Zen and the Art of Motorcycle Maintenance and trying to apply some of its ideas to working on my bike.

I also purchased a used electric piano, which I enjoyed playing for many years.  Although I had written 1 or 2 songs in high school, I didn’t try more serious song-writing until I had my own piano.  I think I had fantasies of being in a rock band, and even auditioned at one point, but was turned down because the group felt my grad studies would prevent a full commitment to the band – I’m sure they were right. I still, to the present day, enjoy playing keyboards and writing songs.

My passion for puzzles continued unabated.  I added to my puzzle collection – my favorite puzzle store was (and still is) Games People Play, located in Harvard Square.  I recall that it was around November 1979 when my wife got me perhaps the best puzzle gift I’ve ever received.  She went into Games People Play by herself and asked for “the hardest puzzle you have”!  Carol, the owner, showed her a Rubik’s Cube, saying “we just got these in from Hungary, and it’s so hard he won’t be able to solve it”.  Of course, I couldn’t resist that kind of challenge, and after nearly a week of intensive work (I’d say roughly 15-20 hours over 5 days), I had finally developed my own solution method.  This was before Rubik’s Cube hit mega-popularity, and if anyone had suggested I should write a book on “How to Solve Rubik’s Cube” I would have laughed out loud at them!  This puzzle was so hard (so I figured) that it would only appeal to a very small number of hard-core puzzle solvers, and they are not the type to want to hear about anyone else’s solution (at least not until they had solved it themselves).  So I failed to cash in on the boom in Rubik’s Cube solution books — there were roughly 4 or 5, I think, all simultaneously on the NYTimes top-10 non-fiction best-seller list for a time (1981?).  Just goes to show I’m a terrible market analyst!

I also had a number of relationships with women during these years, from which I learned a lot, and have mostly very positive memories!  In June of 1977 I met the woman I was to later marry.  We had a 2-year relationship which led to marriage in June 1979.  There was a lot going on in 1979 — In addition to getting married, I was writing up my SM thesis,  applying for jobs, accepting my first job, and moving to Pittsburgh — I’ll tell you more in the next installments on marriage and career.

… to be continued

Notes on a Life in Progress – part 2 (Undergraduate Years 1970-1974)

When it came time to apply to college, MIT was really the only choice that interested me.  I had read an article about the Artificial Intelligence Lab at MIT, and how they were working to program a robot to play with blocks.  I thought that was so cool.  I was already interested in A.I. as a result of reading Isaac Asimov and lots of other science fiction, which imagined the possibility of intelligent robots.  I also felt that machine intelligence might give humanity a little humility — I was getting tired of hearing how great humans were.  Examples: that humans were the “pinnacle of evolution”, the only animals with language, the only creatures with “free will”, and devoid of instinctual behavior.  These claims seemed ridiculous to me.  Any thoughtful person would realize that evolution is a continuing process, and that we are just one stage in that process.  It also seemed clear to me that animals have varying degrees of intelligence and language, but humans seem to have a need to see themselves as “uniquely special”.  My plan was to study “cognitive science” (though that had not yet been defined as a discipline) by combining studies in computer science, mathematics, cognitive psychology, and neurophysiology.

MIT seemed like a great place to pursue these goals, so I applied.  I also applied to RPI and Michigan State (because I wasn’t certain I’d get into MIT, and there were remote chances of a full scholarship at these schools).   I was fortunate to qualify as a National Merit Semi-Finalist, and was eligible for an IBM Thomas Watson National Merit Scholarship, because my father worked for IBM.  I got into all three schools, and when I didn’t get a full scholarship to MSU or RPI, I happily chose to attend MIT.

MIT Experimental Study Group

Perhaps the single most transformative experience of my life occurred during my Freshman Year.  I joined and participated in the Experimental Study Group (ESG).  ESG is a special alternative educational program founded at MIT in 1969, so it was in its second year when I arrived as a Freshman.  I nearly missed out on this fantastic experience — I didn’t hear about it until late in the summer, because I was away at the OSU Math program.  I came home to a deluge of accumulated mailings from MIT.  One of the mailings described a program (ESG) that sounded too good to be true – a program where students could design their studies, work at their own pace, choose their own textbooks, and interact with faculty on a personal and informal basis.  Unfortunately, there were only a limited number (50) of openings available for incoming freshmen, and preference was given to those who responded early expressing interest.  I was “certain” that a fantastic program like this would be oversubscribed – so I foolishly neglected to pursue it.  Later, after I arrived at MIT and joined a fraternity (to try to do something about my pathetic social life), I completed one week of classes, and was mostly bored and frustrated.  Happily, one of my fellow fraternity pledges had joined ESG, and told me about it with great enthusiasm.  It turned out there were still openings!  I wasn’t going to miss out on this “second chance”, so I immediately checked it out, talked with my advisor, and transferred into ESG!  Best decision of my life!

ESG was designed to give students the freedom to pursue their interests, and to work independently at their own pace.  There were no specific requirements, and freshman participants received 45 units of “free elective credit” per semester.  MIT’s graduation requirements included:

1. Amassing 360 units of credit (typically over 4 years)

2. “General Institute Requirements” in Math, physics, chemistry, and humanities

3. Completing the specific degree requirements of one’s major (corresponding to the department enrolled in)

4. A PE (Physical Ed.) requirement, along with passing a swim test

So the 45 “Free Elective Credits” of ESG put freshmen on track toward the 360 credit total.  We of course still had to satisfy all the other requirements at some time before graduation, but ESG allowed us to “defer” working on any requirements if we chose to.  I, like many of my ESG peers, chose to work on most of the typical freshman requirements, but to do so on my own terms and at my own pace.  I chose to work on calculus using the classic Apostol text, for example.  I worked on the 1st semester physics doing independent study from the notes for the “Advanced Mechanics” physics option (which used more calculus).  There were also some interesting seminars offered by ESG faculty, and I joined the “Writing Seminar” offered by Peter Elbow, which was a terrific experience, and also satisfied 1 humanities requirement (the total humanities requirement was 8 semesters of humanities courses, with a distribution requirement of 3 courses from different areas, and a concentration of 3 courses within one area; I chose psychology).

ESG was a “learning community”

The ESG community consisted of 50 Freshmen, a number of sophomores (who had started out as ESG Freshmen in 1969), faculty in the core areas of study (Math, Physics, Chemistry, Biology, Humanities), and administrative staff (director and secretary).  Everyone interacted informally and got to know each other on a first-name basis.  Our community space consisted of nearly the entire 6th floor of Building 24.  There was a common room (with couches, chairs, and tables), a kitchen off the common area, a library, several seminar rooms, a lab and computer room, and a music room (with turntable, stereo, and LP collection).  The community was self-governing, with regular meetings where everyone had a voice.  We had community lunches every Friday.  Basically people would “hang out” around ESG (pretty much 24/7, though the faculty and staff were around primarily during daytime hours).  Interactions were serendipitous, and led to a stimulating free-flow of ideas, interests, and expertise.  Everyone was considered a “learner” – faculty were just senior (more experienced) learners!

One of the memorable dynamics I recall was the “ad hoc seminars”.   There was a bulletin board outside the administrative offices, strategically placed at the entrance to the ESG 6th-floor area.   There was an area on this bulletin board where any community member could post a “seminar proposal” — e.g. “I’m interested in learning about X – who would like to learn with me?”  Then anyone could sign up, and those interested would form a “study group” to share their learning and experience on the topic.  Some seminars never got off the ground, but others took on a life of their own, leading to fascinating explorations and learnings!

ESG blurred artificial boundaries

With traditional education, I chafed at what I consider “artificial boundaries”:

1. Temporal boundaries (compartmentalizing learning into courses with fixed beginning and end times)

2. Content boundaries (compartmentalizing learning into fixed course content chunks)

3. Class/role distinctions between teacher and student.

4. Academic vs. Non-academic learning and interaction

At ESG, these boundaries were loosened — learning could go on for as long (or as short) as interest continued.  The content of learning wasn’t necessarily chopped up into distinct disciplines; rather, interdisciplinary exploration and learning was natural and easy.  As I mentioned, faculty and students were simply all learners, and the informal interactions facilitated sharing and learning.  Learning could be in any area, not restricted to traditional academic areas.  For example, I learned to play the game of Go (spending an intensive week doing little else), and skills at lock-picking were shared and learned via informal interactions.

ESG surprised me by providing advantages beyond my expectations

I joined ESG primarily to gain greater individual freedom in pursuing my education.  What I found was that there was tremendous power in the learning community model!  ESG was a close-knit social community that became almost like family, and was far more important in its influence on me than mere freedom.  I found that ESG became my “social group”, so much so that I dropped out of my fraternity and moved to the dorms.  We sometimes joke about ESG being “Epsilon Sigma Gamma” – because it was like a fraternity in many ways.

Learning about Collaboration   

A very important lesson I learned through ESG was the power of collaboration.  Pre-college, I took a very individual approach to life (I didn’t have collaborators that shared my interests), and I prided myself on doing things on my own — I enjoyed the satisfaction of working hard and solving a difficult puzzle, and if someone gave me a hint or suggestion, I’d often feel cheated that I couldn’t do it “all on my own”. At ESG, I had many opportunities to work on problems and puzzles with others, and I found that there could be tremendous pleasure from a collaboration where each party was making contributions!  Moreover, via collaboration it was possible to do so much more (and to learn faster) than I could working strictly on my own.  One striking example was taking Winston’s “Introduction to Artificial Intelligence”, which a group of us (numbering 10 or more) from ESG took “together”.   I learned the material so much better by working on it as part of this group.  We even collaborated on problem sets and Exams (with the instructor’s blessings).   Note that this was a course offered through the “regular curriculum” as we referred to non-ESG courses, since ESG did not have a computer-science faculty member.  Because I did well in the course (Freshmen were on Pass-Fail, but my “hidden grade” was an “A”), I got to meet with Prof. Winston, who encouraged me to pursue my AI interests – he later became my Thesis Advisor in grad school.  I will explore this power of collaboration much further in future posts, but perhaps the single most significant advantage is having help readily available to get un-stuck whenever one hits a sticking point.

Extracurricular activities

I pursued a number of enjoyable “extracurricular” activities while an undergrad.  I took both Sailing (on the Charles River) and Folk-dancing as PE classes, and continued to enjoy both for many years to come.  I was actually surprised at how much I enjoyed folk-dance!  I was very much a rebel by nature, naturally challenging authority and resisting external structure.  Folk-dance, on the other hand, involved a fair amount of structure — there were prescribed sequences of steps to perform, and not a lot of room for individual interpretation.  Nevertheless, the structure of dances is generally very hierarchical (there are basic patterns, and these get assembled into different groups for the verses and choruses of dances, and these verses and choruses repeat in yet a higher-level structure).  This struck me as similar to the structure of computer programs, with subroutines written from basic language elements and assembled into larger subroutines and full programs – which I’m sure was a large part of the appeal.  I also loved the music (especially of traditional Israeli dance), and the collaboration involved in dancing as part of a group, or with a partner.  I also enjoyed the social aspect of the activity: it was a low-key, informal, and pleasant way to meet women!  In fact, it was through folk-dancing that I met the woman who later became my wife and mother of my 3 children (but that story comes later).

Majoring in Math

I chose to major in mathematics.  Partly because I loved math and wanted to learn a lot of it, but also because it had fewer requirements than the alternative of Computer Science.   I wanted to continue pursuing my interdisciplinary studies toward the goal of working in AI and machine learning, and the math requirements gave me greater freedom to engage in a broader program of studies.  I have fond memories of learning a lot of abstract math (Abstract Algebra, Linear Algebra, Analysis, Point-set Topology, Algebraic Topology, Combinatorics, and more).

I particularly came to love Combinatorics, especially enumeration and graph theory.  One of the most influential courses I took was Intro to Combinatorics taught by Prof. Dan Kleitman.  The first day of class blew my mind.  He presented Cayley’s theorem for enumerating trees: the number of distinct trees on N labelled vertices is exactly N^(N-2) [that’s N taken to the power N-2].  This formula seemed amazing – it was beautiful and simple!  He gave an interesting proof (I’m not sure which one) and over the next days I worked to come up with my own proof, by creating a 1-1 correspondence between labeled trees and (n-2)-tuples of numbers taken from the range [1 to n].  It is easy to see that there are n^(n-2) such tuples, so establishing a correspondence of trees with tuples proves there are exactly n^(n-2) trees. My interest in Cayley’s Theorem led to a summer research project (UROP – Undergraduate Research Opportunities Program) exploring generalizations of this theorem to “higher dimensions”.  This led to my first mathematical publication, in Discrete Mathematics Journal, jointly authored by me and my advisor.  I often wonder what I would have learned if I had continued to regularly attend Kleitman’s course, but sadly I neglected going to “lectures” which were scheduled early in the morning, and I was already a “night-owl” who liked to sleep in. I only managed to attend 2 or 3 additional classes, solely for the purpose of handing in the 3 required project assignments [my first project was a programming project that computed the correspondence between trees and their tuple-encoding. I learned later that the German mathematician Prufer had first discovered such an encoding — mine was different and original yet similar].  If I had a “do-over” I’d make sure I attended every single one of Kleitman’s lectures in that course!   Kleitman later became one of the members of my “graduate school committee” (representing math in my interdisciplinary grad program).
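The correspondence idea can be illustrated with the standard Prüfer encoding (a sketch of the classic construction, not my original variant): decoding every possible (n-2)-tuple of labels yields every labeled tree exactly once, which verifies Cayley’s formula for small n.

```python
from itertools import product

def prufer_decode(seq, n):
    """Decode a Prüfer sequence (labels 1..n, length n-2) into a tree's edge set."""
    # Each vertex's degree in the tree = 1 + its number of occurrences in seq.
    degree = {v: 1 for v in range(1, n + 1)}
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Attach v to the smallest-labeled remaining leaf.
        leaf = min(u for u, d in degree.items() if d == 1)
        edges.append((min(leaf, v), max(leaf, v)))
        degree[leaf] -= 1  # leaf is used up (degree drops to 0)
        degree[v] -= 1
    # Exactly two vertices of degree 1 remain; join them.
    u, w = sorted(x for x, d in degree.items() if d == 1)
    edges.append((u, w))
    return frozenset(edges)

n = 4
trees = {prufer_decode(seq, n) for seq in product(range(1, n + 1), repeat=n - 2)}
print(len(trees))  # 16 == 4**(4-2), as Cayley's formula predicts
```

Since the decoding is a bijection, the 4^2 = 16 tuples produce 16 distinct trees on 4 labeled vertices, and the same count works for any n.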

I think it was my sophomore year that I first took the William Lowell Putnam Exam (a college-level competition in mathematics).  I was not on the MIT team, but anyone could sign up to take the exam and compete as an individual.  I remember being totally peeved at my dorm roommate, who “kicked me out” of my room on the night before the exam because he was having a girl stay over.  I ended up finding a place to sleep in a Radcliffe dorm, courtesy of a female friend, but the result was that I overslept!  When I woke up, I had to scurry to get to MIT and start the exam.  I arrived almost 2 hours late for the 3-hour morning session.  There was also, if I remember, a 2nd 3-hour afternoon session.  I was really tired, and rushed through the morning exam (6 questions, of which I answered maybe 2 or 3 reasonably, and did some hand-waving on a few of the others, which might have garnered some partial credit, who knows).  I had the full 3 hours in the afternoon, but still could only answer a few of the questions — they are designed to be extremely difficult and challenging.  Imagine my surprise when I learned that, despite my lack of sleep and tardy arrival, I received an Honorable Mention, which meant I was ranked in the top 50 scorers nationally.  Based on this performance, I was invited to join the MIT team the following year, and I made sure to get more sleep and show up on time – nevertheless, I did terribly, and felt bad that I hurt the team’s overall ranking.  I think I ranked somewhere just above 300, but my memory is cloudy at this point.  Maybe there are advantages to taking an exam with less time and less sleep!

Approaching graduation

As graduation approached, I became very nervous about my future.  Graduate school was the natural next step in pursuing a career in AI/machine learning, but what if I didn’t get accepted?  There were only a few major centers where AI work and studies were possible around 1974, so I applied to the “big 3” in the U.S.:  MIT,  Stanford, and CMU.   I was relieved, and extremely pleased to be accepted at all 3!  I was honored to receive a personal phone call from Don Knuth (Stanford) informing me of my admission, and encouraging me to come to Stanford.  I remember vividly that the call was on a Saturday morning, and that it woke me up  — so I was no doubt somewhat groggy.  Still, I was really excited!   Knuth, for those of you who don’t know, is famous for his multi-volume  series of books “The Art of Computer Programming”.   It wasn’t until 2009 that I met him in person (at the International Puzzle Party in San Francisco), and was able to thank him for honoring me with that call!   I also got accepted to both CMU and MIT, and was offered a full-fellowship at MIT through the interdisciplinary DSRE (Division for Study and Research in Education), where I’d have the opportunity to work with Seymour Papert, co-director of the MIT AI Lab.  Papert had a broad range of interests, including math (he did seminal work with Minsky on Perceptrons), education (he helped create and promote the LOGO computer language for children’s education),  developmental psychology (he had worked with Piaget in Geneva), and puzzles (he was an avid puzzle collector and solver, and loved “thinking about thinking” often using puzzles for that purpose).  Working with Papert seemed like the best fit to my interdisciplinary interests, and having a full-fellowship rather than Teaching or Research Assistantships seemed like a big plus.  
The interdisciplinary program in DSRE provided an almost ESG-like freedom to design my own course of studies — I’d get to work with my committee to define all my own requirements! Finally, I was already familiar with MIT, and the Cambridge/Boston environment, so I figured I could “hit the ground running”.  I accepted the MIT / DSRE Fellowship offer and the next stage of my career was mapped out.

MIT Graduation

I received my SB Mathematics degree from MIT in 1974.  Amazing even myself, I had achieved a 4.9/5.0 gpa, and graduated Phi Beta Kappa.  Embarrassing was the fact that my only “B” was in a math course — the fact that I hadn’t completed all the homework was held against me 😦   Nevertheless, I was proud of my accomplishment, and my parents and siblings came to attend my graduation!

To be continued … next up: Grad School at MIT
