SANS Index Creation and Study Guide

The Index

All GIAC exams being open book is not the blessing it seems to be. There are generally 8-9 books (including the lab workbooks) and searching for something can be like finding a needle in a haystack. I have a very rudimentary process for creating an index that I have used for the last 4 exams and it has worked well for me. I am not saying this is the best way to do it, but it is the way that works for me.

The Included Index

Depending on the author of the course, there may be an existing index included in the last pages of the workbook. Most of the SANS classes I have taken up to this point have included this index, but it is at the course author’s discretion. It can be good enough to pass the class, but if I took a class that did not have an index I would recreate the list of topics and what page number and book they appear in. This basic index is helpful for Hail Mary searches during an exam where you have to find all the mentions of a topic very quickly. However, I find it hard to parse the context of something like this:

Topic	Locations
Azure AD	3.111, 4.21, 5.161
AWS S3	2.11, 3.21, 4.161

In the middle of the test I have to think about what day of class and what book that correlates to. Then I have to find the context of the question I am researching and hope that my guess is close enough to confirm the answer.
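
To make that notation concrete, here is a minimal Python sketch that splits the references apart. It assumes every entry follows the book.page pattern from the sample table above (so 3.111 means book 3, page 111); the function name parse_locations is made up for illustration.

```python
def parse_locations(locations: str) -> list[tuple[int, int]]:
    """Split a string like '3.111, 4.21' into (book, page) pairs.

    Assumes each entry is 'book.page', as in the sample index rows.
    """
    pairs = []
    for entry in locations.split(","):
        book, page = entry.strip().split(".")
        pairs.append((int(book), int(page)))
    return pairs


# The "Azure AD" row from the sample table
print(parse_locations("3.111, 4.21, 5.161"))
# [(3, 111), (4, 21), (5, 161)]
```

Even decoded like this, each entry only says where a topic shows up, not what the slide is actually about.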

My Custom Index

Instead of tying a topic to a simple page number, my personal index takes a different approach. I create a spreadsheet with the following columns:

Book	Page	Slide Title	Keyword 1	Keyword 2
1	1	Introduction to Slide Topic	Azure	AD
1	2	Introduction to Slide Topic	AWS	S3

This allows me to take each book and have a quick way to reference slide titles and the potential data located on each slide. The keywords are made up of the paragraphs of content underneath the slides.
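
As a rough illustration of why this layout is easier to work with, here is a small Python sketch that models the rows and looks up a keyword. The IndexRow class and find_keyword function are hypothetical names, not part of any tool mentioned in this post, and the rows are the two sample rows from the table above.

```python
from dataclasses import dataclass


@dataclass
class IndexRow:
    book: int
    page: int
    slide_title: str
    keywords: tuple[str, ...]


def find_keyword(rows, term):
    """Return rows whose slide title or keywords contain the term (case-insensitive)."""
    term = term.lower()
    return [
        row for row in rows
        if term in row.slide_title.lower()
        or any(term in keyword.lower() for keyword in row.keywords)
    ]


rows = [
    IndexRow(1, 1, "Introduction to Slide Topic", ("Azure", "AD")),
    IndexRow(1, 2, "Introduction to Slide Topic", ("AWS", "S3")),
]

for row in find_keyword(rows, "s3"):
    print(f"Book {row.book}, page {row.page}: {row.slide_title}")
# Book 1, page 2: Introduction to Slide Topic
```

Each hit comes back with its slide title attached, which is the context the included index lacks.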

I do this for every book. When I am done I have a spreadsheet with sheets for each book. You could print the various sheets but I find it hard to parse the information in this format.

Indigo - XLSX to Markdown

I created a tool called indi-go that takes the xlsx file and converts it to markdown. I then print the markdown files as PDFs. This allows me to have a parseable index that I can scan for keywords and slide titles.
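
The core idea of that conversion can be sketched in a few lines of Python. This is not indi-go's actual code, just an illustration of turning spreadsheet rows into a Markdown table; the function name rows_to_markdown is hypothetical, and the rows are assumed to have already been read out of the spreadsheet as plain lists.

```python
def rows_to_markdown(header, rows):
    """Render a header and data rows as a Markdown pipe table."""

    def fmt(cells):
        # Escape pipes so a cell's content cannot break the table layout
        escaped = (str(cell).replace("|", "\\|") for cell in cells)
        return "| " + " | ".join(escaped) + " |"

    lines = [fmt(header), fmt("---" for _ in header)]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


header = ["Book", "Page", "Slide Title", "Keyword 1", "Keyword 2"]
rows = [
    [1, 1, "Introduction to Slide Topic", "Azure", "AD"],
    [1, 2, "Introduction to Slide Topic", "AWS", "S3"],
]
print(rows_to_markdown(header, rows))
```

Writing one Markdown file per sheet keeps each book's index separate when printing.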

Practice Exams

The graduate certificate price includes two practice exams. I take the first practice exam with no assistance from any index. This test is a baseline to see where I am falling short in certain areas. I then take the second practice exam with the index. This is to see if I can find the information I need quickly and accurately. If I run into problems parsing my notes I can make adjustments to my system before I sit for the final exam.

The Final Exam

SANS retakes are notoriously expensive. Since I don’t want to pay for another shot at the test, I make sure to run through all the labs for the class once or twice before I sit. My notes also include cheatsheets for the lab exercises, covering many of the commands used.

Adding a Word Cloud

Since I had digital PDFs on my last exam (SEC541), I decided to play around with running some pandas and matplotlib in Python to scrape the PDFs for each day and create a word cloud to represent the contents of the book. This was wayyyy harder to do well! I eventually was able to clean the extracted PDF data of all watermark text, UUIDs, home addresses and the like. With that I was able to write a simple script to display the data. It’s a chonky boy and includes some PII so I don’t have it up on GitHub, but here is the generate_wordcloud() function I wrote:

import matplotlib.pyplot as plt
import spacy
from wordcloud import WordCloud


def generate_wordcloud(text, exclude_words=None):
    if exclude_words is None:
        exclude_words = []

    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)

    # Extract proper nouns (PROPN) and noun chunks whose root is a proper noun
    proper_nouns = [token.text for token in doc if token.pos_ == "PROPN"]
    compound_proper_nouns = [chunk.text for chunk in doc.noun_chunks if chunk.root.pos_ == "PROPN"]

    # Combine proper nouns and compound proper nouns
    filtered_words = proper_nouns + compound_proper_nouns

    # Remove single-character words and common irrelevant words
    irrelevant_words = ['is', 'the', 'and', 'or', 'of', 'in', 'for', 'on', 'as', 'to', 'at', 'by', 'with', 'from', 'into', 'during', 'including', 'until', 'against', 'among', 'throughout', 'despite', 'towards', 'upon', 'concerning', 'to', 'in', 'for', 'on', 'by', 'about', 'like', 'through', 'over', 'before', 'between', 'after', 'since', 'without', 'under', 'within', 'along', 'following', 'across', 'behind', 'beyond', 'plus', 'except', 'but', 'up', 'out', 'around', 'down', 'off', 'above', 'near']
    filtered_words = [word for word in filtered_words if len(word) > 1 and word.lower() not in irrelevant_words]

    # Remove specific phrase "SEC Monitoring"
    filtered_words = [word for word in filtered_words if word != "SEC Monitoring"]

    # Join the filtered words back into a string
    filtered_text = ' '.join(filtered_words)

    wordcloud = WordCloud(width=800, height=800,
                          background_color='white',
                          min_font_size=10,
                          stopwords=set(exclude_words)).generate(filtered_text)

    # Print the words going into the word cloud
    print("Words in the word cloud:")
    for word, frequency in wordcloud.words_.items():
        print(f"{word}: {frequency}")

    plt.figure(figsize=(8, 8), facecolor=None)
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.tight_layout(pad=0)
    plt.savefig("wordcloud.png")
    plt.show() # if you are in a notebook you can use plt.show() to display the image.

I am about as good a Python developer as I am a mechanic (that is to say: not very good), but I was eventually able to hammer something into place that met my needs.
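
The cleanup step mentioned earlier (stripping watermark text and UUIDs from the extracted PDF text before building the word cloud) could look something like the sketch below. It is not the script described above: the function name clean_page_text and the "Licensed to" watermark phrase are assumptions for illustration, and real watermark text will differ.

```python
import re

# Matches the standard 8-4-4-4-12 hexadecimal UUID layout
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def clean_page_text(text, watermark_phrases):
    """Remove UUIDs, then drop blank lines and lines containing a watermark phrase."""
    text = UUID_RE.sub("", text)
    kept = []
    for line in text.splitlines():
        lowered = line.lower()
        if any(phrase.lower() in lowered for phrase in watermark_phrases):
            continue  # the whole line is treated as watermark text
        if line.strip():
            kept.append(line.strip())
    return "\n".join(kept)


# Hypothetical extracted page text
sample = (
    "Licensed to: Jane Doe\n"
    "Azure AD basics 123e4567-e89b-12d3-a456-426614174000\n"
    "\n"
    "AWS S3 buckets"
)
print(clean_page_text(sample, ["Licensed to"]))
# Azure AD basics
# AWS S3 buckets
```

Dropping whole lines is a blunt instrument, so it is worth spot-checking the output to make sure real content is not being thrown away along with the watermark.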

Final Thoughts

Creating a comprehensive index for SANS/GIAC exams might seem tedious, but it’s an investment that pays dividends. The process of building the index helps solidify concepts in your mind, making you more familiar with the material even before exam day. Having spent hours with each book, you develop an intuitive understanding of where specific topics are covered.

My approach combines:

A detailed spreadsheet tracking book numbers, page numbers, slide titles, and keywords
Converting this to easily parseable markdown/PDF format using my custom tool
Supplementing with visualizations like word clouds to provide another perspective on content focus
Strategic use of practice exams to validate and improve the index

This method has helped me pass four GIAC exams so far while managing exam anxiety. The peace of mind from having a well-organized index is invaluable when facing questions about obscure topics buried somewhere in 1000+ pages of material.

Remember that creating an index is part of the learning process itself. The time spent organizing the information is not wasted—it’s an active form of studying that will serve you well during the exam and beyond in your career.

Good luck with your SANS courses and GIAC certifications. The exams may be challenging, but with proper preparation and a solid index, you’ll be well-equipped to succeed.