## Texify: Hands-Free EECS 203 Templates 📓

#### A tool to help kick-start EECS 203 assignments.

Or: documenting the process of developing one of my very first projects

The winter of my freshman year, I took Discrete Mathematics (EECS 203 at Michigan). On a professor’s recommendation, I also started to learn LaTeX.

If you don’t know, LaTeX is a complicated but powerful document markup language. It’s the standard for countless academic papers, and for tedious homework assignments too.

Below, you can see an example of a typical EECS 203 assignment.

Learning to use LaTeX was hard - even writing the boilerplate took a decent chunk of time. I wanted to find a way to minimize the busy-work so that I could spend more time on the assignment itself.

Here’s an example of bare-bones LaTeX boilerplate:

    \documentclass{article}
    \usepackage{graphicx}

    \begin{document}

    \title{Introduction to \LaTeX{}}
    \author{Author's Name}

    \maketitle

    \begin{abstract}
    The abstract text goes here.
    \end{abstract}

    \section{Introduction}
    Here is the text of your introduction.

    \begin{equation}
    \label{simple_equation}
    \alpha = \sqrt{ \beta }
    \end{equation}

    \begin{figure}
    \centering
    \includegraphics[width=3.0in]{myfigure}
    \caption{Simulation Results}
    \label{simulationfigure}
    \end{figure}

    \section{Conclusion}

    \end{document}


Which looks like this:

For the assignment itself, the main challenge was extracting the problem data (points, section, problem number, and parts) from the assignment PDF.

I found a tool online called Tabula that can parse data tables in PDF files, as well as a Python wrapper that converts each table into a pandas DataFrame. Perfect.

LaTeX, like most coding languages, is picky. There are two main “modes” - math and text. Some symbols can only be used in math mode. For example, `\sum` only compiles between `$` delimiters; used in plain text, it raises an error.

Since this is only for extraction and display purposes, most symbols can be simply removed:

    # Strip each symbol; regex=False makes "." and "^" match literally
    for c in r"""."'^“”/√_""":
        df['Problem'] = df['Problem'].str.replace(c, '', regex=False)
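
To see what that loop does to a single problem string, here is a minimal sketch in plain Python rather than pandas. The helper name `strip_symbols` and the sample text are made up for illustration; only the character set comes from the loop above.

```python
# Characters to drop; copied from the loop above.
STRIP_CHARS = r"""."'^“”/√_"""


def strip_symbols(text):
    """Return text with every character in STRIP_CHARS removed."""
    # str.maketrans with a third argument maps those characters to None.
    return text.translate(str.maketrans("", "", STRIP_CHARS))


sample = "12 a,b “Show that √2 is irrational.”"  # made-up problem row
print(strip_symbols(sample))
```

Commas are not in the set, so a parts list such as a,b survives the cleanup and can still be split later.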


Regular expressions and string splitting can then pull the problem number and its parts out of each row:

    # Extract the problem number and its lettered parts (raw strings keep \d intact)
    df['Number'] = df['Problem'].str.extract(r'(^\d+\.?\d*)', expand=False)
    df['Parts'] = df['Problem'].str.extract(r'^\d+\.?\d*\s*([a-z,]+)', expand=False)

    # Split 'Parts' by comma into a list
    df['Parts'] = df['Parts'].str.split(",")
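
As a quick check of what those two patterns capture, here is a small sketch that applies them to individual strings with the standard `re` module instead of pandas. The function name `parse_problem` and the sample inputs are hypothetical; the patterns are the ones used above.

```python
import re

# Same patterns as in the pandas calls above.
NUMBER_RE = re.compile(r'(^\d+\.?\d*)')
PARTS_RE = re.compile(r'^\d+\.?\d*\s*([a-z,]+)')


def parse_problem(text):
    """Return (number, parts) for one cleaned problem string.

    number is a string such as "12"; parts is a list such as ["a", "b"],
    or None when no lowercase letters follow the number.
    """
    number = NUMBER_RE.search(text)
    parts = PARTS_RE.search(text)
    return (
        number.group(1) if number else None,
        parts.group(1).split(",") if parts else None,
    )


print(parse_problem("12 a,b,c"))  # number with three parts
print(parse_problem("7"))         # number only, no parts
```

One thing the sketch makes visible: the parts pattern accepts any run of lowercase letters, so a row whose text starts with a lowercase word right after the number would have that word captured as its parts.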


Now I had all of my material. I just needed a way to write it out. I made my own hacky templating syntax: if a line of the template is exactly !split, the problem data is written in its place.

Note: a proper templating engine such as Jinja2 or Liquid, which I wasn't aware of at the time, could have been used instead.

    import os

    def write_tex(df, template_path, output_dir, output_file):
        """Write new output file with dataframe and template.txt"""
        output_path = os.path.join(output_dir, output_file)

        # Copy the template line by line; a line that is exactly "!split"
        # is replaced by one rendered block per problem in the DataFrame.
        with open(template_path, 'r') as template, open(output_path, 'w') as output:
            for line in template:
                if line == "!split\n":
                    for i in df.index:
                        output.write(make_problem(df["Points"][i],
                                                  df["Section"][i],
                                                  df["Number"][i],
                                                  df["Problem"][i],
                                                  df["Parts"][i]))
                else:
                    output.write(line)
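
The `make_problem` helper that `write_tex` calls isn't shown here. A minimal sketch consistent with the generated output further down might look like the following. The 80-character comment rule, the `parts=None` default, and the way parts are rendered are all assumptions made for illustration, not the original code.

```python
def make_problem(points, section, number, problem, parts=None):
    """Return one \\question block as a string (hypothetical layout)."""
    rule = "%" * 80  # width chosen arbitrarily for this sketch
    lines = [
        rule,
        rule,
        rf"\question[{points}] Section {section} Problem {number} \\",
        str(problem),
    ]
    # Assumption: rows without parts hold NaN rather than a list, so only
    # a non-empty list produces a parts environment.
    if isinstance(parts, list) and parts:
        lines.append(r"\begin{parts}")
        for part in parts:
            lines.append(rf"\part % textbook part {part}")
            lines.append(r"\begin{solution}\\")
            lines.append("")
            lines.append(r"\end{solution}")
        lines.append(r"\end{parts}")
    else:
        lines.append(r"\begin{solution}\\")
        lines.append("")
        lines.append(r"\end{solution}")
    lines.extend([rule, rule])
    return "\n".join(lines) + "\n"


print(make_problem(1.0, "9.2", 12, "What do you obtain when you apply..."))
```

Returning a string, rather than writing to the file directly, keeps the helper easy to test on its own and leaves all file handling in `write_tex`.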


Putting it all together with some attractive boilerplate LaTeX, the program spits out:

    \begin{document}

    \title{ EECS 203: Discrete Mathematics \\
    Homework [Homework Number] (Winter 2017)}
    \date{Due: Now}
    \author{[Name]}

    \maketitle

    \begin{questions}

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    \question[1.0] Section 9.2 Problem 12 \\
    “What do you obtain when you apply...”
    \begin{solution}\\

    \end{solution}
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    .
    .
    . Questions here
    .
    .

    \end{questions}

    \end{document}



Which will look like this: