This page outlines the skills I think one needs to be productive in my lab. My choices reflect my research focus (computational linguistics and applied mathematics) and training (PhD in computational neuroscience and MD specializing in toxicology). You don’t have to know everything here. Someone focusing on linguistics doesn’t need to know about Hilbert or Banach spaces. For whichever parts are most relevant to your project, mastery comes from directed learning and focused, persistent practice.

My Overall Philosophy

Science educates humanity, in the original meaning of the word educates. The pursuit of science is a rewarding calling with purpose and intellectual fulfillment. The business of science is destructive, feeding and empowering a scientifically illiterate administrative class. Administrative parasitism is not a uniquely modern concern, nor is science’s reliance on someone else to make the money [1]. Galileo lauded his patrons at the beginning of his books, usually for about 40-50 pages. To succeed in modern science one needs luck, grit, talent, intellectual honesty, and realpolitik, in that order.

I strive to create a lab where people join because they are interested in and inspired by joining me on a mutual voyage of discovery. I expect those working with me not to misrepresent their talents or interests to get a spot, nor to phone anything in. (Phoning it in is ’90s argot that means to do a poor job because you didn’t really try. Perhaps in the era of the smartphone everyone is phoning it in.) This expectation comes partly because I think interesting things happen when smart, creative people are given the time to be freely productive and partly because I don’t have the bandwidth or inclination to micromanage.

Fundamentals Everyone Should Know

  1. Facility with Python, Prolog, Lisp, Julia, or Haskell. In at least one of these languages you must be able to load a data file, do some analysis, and write the results to a graph or table.
  2. Facility with Markdown (for describing issues on GitHub and drafting manuscripts).
  3. How to write a scientific paper [for MS and above].
  4. How to write a 250-word abstract [for college students and above].
  5. Anatomy of a Specific Aims Page. The Specific Aims page is the crystallization of your proposal, similar to but far more detailed and precise than an executive summary. All grant applications, whether to private foundations or the NIH, DoD, or NSF, hinge on the Specific Aims page.
  6. GitHub. I use GitHub to organize and share code.
  7. Missing Semester. This MIT course teaches you the day-to-day tools you need to do computational science, for example how to move and rename masses of files, quickly edit files with vim, or make a build system.
  8. LaTeX. LaTeX makes beautiful preprints and technical papers. I use Beamer for almost all my presentations. I write as many of my grants in LaTeX as possible. IEEE conferences accept TeX submissions. Biomedical journals sometimes do.
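
As a rough yardstick for item 1 above (load a data file, do some analysis, write the results to a table), here is a minimal Python sketch using only the standard library. The column names `group` and `value`, the function names, and the sample data are all hypothetical, chosen for illustration; they are not a lab convention. The output is a Markdown table, which ties in with item 2.

```python
import csv
import statistics
import tempfile
from collections import defaultdict
from pathlib import Path


def load_measurements(path):
    """Read a CSV with 'group' and 'value' columns into {group: [values]}."""
    groups = defaultdict(list)
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            groups[row["group"]].append(float(row["value"]))
    return dict(groups)


def summarize(groups):
    """Return one (group, n, mean, sample standard deviation) row per group."""
    rows = []
    for name in sorted(groups):
        values = groups[name]
        # The sample standard deviation is undefined for one point; report 0.0.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        rows.append((name, len(values), statistics.mean(values), spread))
    return rows


def write_markdown_table(rows, path):
    """Write the summary rows to a Markdown table."""
    lines = ["| group | n | mean | sd |", "| --- | --- | --- | --- |"]
    for name, n, mean, sd in rows:
        lines.append(f"| {name} | {n} | {mean:.2f} | {sd:.2f} |")
    Path(path).write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Demo on made-up data written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "measurements.csv"
        source.write_text("group,value\ncontrol,1.0\ncontrol,3.0\ntreated,5.0\n")
        target = Path(tmp) / "summary.md"
        write_markdown_table(summarize(load_measurements(source)), target)
        print(target.read_text())
```

The same three steps (read, summarize, write) carry over to any of the other listed languages; a plot in place of the table would need a plotting library, which this sketch deliberately avoids.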

For Computational Linguists

My focus is on representing biomedical knowledge in a computable format and on developing tools that use those representations to contextualize (and especially assess the plausibility of) knowledge expressed in unstructured text.

Knowledge Representation

  1. Basic Formal Ontology. BFO is a framework for creating first-order logic predicates that, when taken together, express a consistent picture about a “portion of reality”. It’s a commonly used standard.
  2. Markov Logic Networks. MLNs express uncertainty and are thus more expressive than BFO. Joining a BFO-compliant ontology with an MLN schema is an area of ongoing interest.
  3. spaCy and Prodigy. They are our NLP backbone.
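
To make the point in item 2 concrete, here is a toy Python sketch of how an MLN attaches uncertainty to a first-order formula. It uses the standard MLN definition, in which the probability of a possible world is proportional to exp(weight × number of satisfied groundings of the formula). The formula Smokes(x) => Cancer(x), the single individual, and the weight 1.5 are made up for illustration; nothing here comes from BFO or from the lab's own schema.

```python
import itertools
import math

WEIGHT = 1.5  # hypothetical weight on the formula Smokes(x) => Cancer(x)


def satisfied(smokes, cancer):
    """Truth value of the implication Smokes => Cancer for one individual."""
    return (not smokes) or cancer


def world_probabilities(weight):
    """Enumerate all four worlds for one individual and normalize their scores."""
    worlds = list(itertools.product([False, True], repeat=2))
    # A world scores exp(weight) if it satisfies the formula, exp(0) otherwise.
    scores = {w: math.exp(weight * satisfied(*w)) for w in worlds}
    partition = sum(scores.values())
    return {w: score / partition for w, score in scores.items()}


if __name__ == "__main__":
    for (smokes, cancer), p in world_probabilities(WEIGHT).items():
        print(f"Smokes={smokes!s:5} Cancer={cancer!s:5} P={p:.3f}")
```

The world that violates the formula (smokes, no cancer) stays possible but becomes less probable than the others. A weight of 0 makes all worlds equally likely, and a very large weight approaches the hard constraint of ordinary first-order logic.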

For Mathematicians

Spatial Statistics

Agent-Based Modeling

  1. Science relying on commercial success, possibly with a government intermediary, is an issue in economies where capitalists hoard money and, hence, the means to accomplish things in that type of economy. That isn’t the only way to do things. It is the way in the US. To do science in the US, we must acknowledge the rules of the game.