Attack Surface Intelligence of Source Code

ME & VULNEX
Simon Roses Femerling
• Founder & CEO, VULNEX www.vulnex.com
• @simonroses
• Former Microsoft, PwC, @Stake
• Black Hat, RSA, OWASP, SOURCE, AppSec, DeepSec, TECHNET
VULNEX
• CyberSecurity Startup
• @vulnexsl
• Services & Training
• Products: BinSecSweeper (Binary Analysis)
TALK OBJECTIVES
• GCC & Python, hand in hand
• Transformations: source code to useful data
• Practical code understanding
WORK IN PROGRESS
AGENDA
1. The need for Attack Surface Intelligence of Source Code
2. GCC Overview
3. GCC-Python-Plugin
4. Source Code Intelligence
5. Tintorera Overview
6. Tintorera Analysis Demos
7. Conclusions
8. Q&A
1. CODE IS GETTING COMPLEX!
Software               SLOC
Firefox                14 Million
Windows Server 2003    50 Million
Debian 7.0             419 Million
Mac OS X 10.4          86 Million
Linux Kernel 2.6.25    13.5 Million
Linux Kernel 3.6       15.9 Million
1. DOCUMENTATION
1. TYPICAL CODE REVIEW
1. WHERE TO START?
• File operations
• Networking
• Processes
• Crypto
• Authentication
• ??
1. TOOLS?
2. GCC
• Compiler system that supports various programming languages
• Widely used on popular UNIX variants
• Supports many major languages: C, C++, Java, Objective-C, etc.
• PLUGINS!!
• FREE
2. GCC INTERNALS
http://www.airs.com/dnovillo/Papers/cgo2007-gcc-internals.pdf
2. GCC TERMINOLOGY
• GENERIC is the common representation shared by all front ends
– Each parser must emit GENERIC
• GIMPLE is a simplified version of GENERIC
– Three-address representation
– Simplified control flow
• RTL (Register Transfer Language) is an assembler for an abstract machine
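To make "three-address representation" concrete, here is a toy sketch that lowers a nested arithmetic expression into statements with at most one operator each. It uses Python's own ast module purely for illustration; the temporaries t1, t2, ... are made-up names, and this is neither GCC's gimplifier nor the exact format of its GIMPLE dumps.

```python
import ast

# Map Python AST operator node types to their source symbols.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def lower(expr_src):
    """Lower an arithmetic expression into three-address statements."""
    stmts = []

    def visit(node):
        # Leaves (names and constants) need no temporary.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = visit(node.left)
            right = visit(node.right)
            tmp = "t%d" % (len(stmts) + 1)  # hypothetical temporary name
            stmts.append("%s = %s %s %s" % (tmp, left, _OPS[type(node.op)], right))
            return tmp
        raise ValueError("unsupported expression node: %r" % node)

    visit(ast.parse(expr_src, mode="eval").body)
    return stmts


if __name__ == "__main__":
    for line in lower("a + b * c - d"):
        print(line)
```

Each emitted statement has one result and at most two operands, which is what makes later per-statement analysis simple.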
2. GCC PASSES
http://gcc-python-plugin.readthedocs.org/en/latest/tables-of-passes.html
3. GCC-PYTHON-PLUGIN
• GCC plugin that embeds Python in GCC
• Now your Python script can access GCC passes and perform analysis
• Developed by David Malcolm (Fedora)
http://gcc-python-plugin.readthedocs.org/en/latest/
3. GCC-PYTHON-PLUGIN EXAMPLE
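The original example on this slide was an image and is not recoverable. As a stand-in, here is a minimal sketch of a gcc-python-plugin script that prints, for each compiled function, its name and the size of its control-flow graph. It follows the callback pattern from the plugin documentation (gcc.register_callback with gcc.PLUGIN_PASS_EXECUTION); the choice of the '*warn_function_return' pass is borrowed from the documentation's example and may need adjusting for a given GCC version. The helper name describe_function is made up for this sketch.

```python
try:
    import gcc  # provided by gcc-python-plugin; only importable inside GCC
except ImportError:
    gcc = None  # lets the helper below be exercised outside GCC


def describe_function(fn):
    """Return (name, number of basic blocks, number of GIMPLE statements)."""
    blocks = fn.cfg.basic_blocks
    # bb.gimple can be None for blocks without statements, hence "or []".
    n_stmts = sum(len(bb.gimple or []) for bb in blocks)
    return fn.decl.name, len(blocks), n_stmts


def on_pass_execution(p, fn):
    # Report once per function, when a pass that runs after the CFG has
    # been built is executed (pass name is an assumption, see lead-in).
    if p.name == "*warn_function_return" and fn is not None and fn.cfg is not None:
        print("%s: %d basic blocks, %d statements" % describe_function(fn))


if gcc is not None:
    gcc.register_callback(gcc.PLUGIN_PASS_EXECUTION, on_pass_execution)
```

Inside GCC the script is loaded through the -fplugin-arg-python-script option; outside GCC the import guard simply leaves the callback unregistered.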
3. GCC-PYTHON-PLUGIN DEMO
3. GCC-PYTHON-PLUGIN IDEAS
• Write scripts for:
– malloc/free usage
– Array boundary checks
– Code visualizations
– You name it!
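As one way to approach the malloc/free idea, the sketch below tallies direct calls to malloc and free in a function and flags a mismatch. It assumes gcc-python-plugin's gcc.GimpleCall exposes an fndecl attribute (None for indirect calls); statements are matched by class name so the code also imports outside GCC. The helper names are hypothetical, and a count mismatch is only a hint for review, not proof of a leak.

```python
from collections import Counter

TRACKED = ("malloc", "free")


def callee_names(fn):
    """Yield names of directly called functions in a gcc.Function-like object."""
    for bb in fn.cfg.basic_blocks:
        for stmt in bb.gimple or []:
            # Assumption: call statements are instances of gcc.GimpleCall.
            if type(stmt).__name__ == "GimpleCall":
                decl = getattr(stmt, "fndecl", None)
                if decl is not None:
                    yield decl.name


def count_alloc_calls(names):
    """Count calls to the tracked allocator functions among callee names."""
    return Counter(name for name in names if name in TRACKED)


def is_unbalanced(counts):
    """True when a function calls malloc and free a different number of times."""
    return counts["malloc"] != counts["free"]
```

A real checker would need data-flow tracking of the returned pointers; counting is just a cheap first filter.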
4. CODE UNDERSTANDING
• Which APIs are being used?
• Number of functions?
• Inputs / outputs of functions?
• Function relationships
• What do the comments say?
• Code complexity
4. CODE METRICS
• Controversial topic but needed
• Metrics:
– Function complexity (cyclomatic)
– Number of:
• Lines
• Code
• Blanks
• Comments
– Line length
– Bugs per line
– You name it…
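The line-based metrics above can be sketched in a few lines of Python. This is an illustrative counter for C source text, not Tintorera's implementation; the function name and the returned keys are made up, and the simplifications are listed in the docstring.

```python
def line_metrics(source):
    """Count total, code, blank and comment lines in C source text.

    Simplifications: a line counts as a comment only if it starts at or
    inside a comment; code with a trailing comment counts as code; a block
    comment that opens after code on the same line is not tracked; comment
    markers inside string literals are not recognised.
    """
    counts = {"lines": 0, "code": 0, "blanks": 0, "comments": 0}
    lengths = []
    in_block = False
    for raw in source.splitlines():
        counts["lines"] += 1
        lengths.append(len(raw))
        line = raw.strip()
        if in_block:
            counts["comments"] += 1
            if "*/" in line:
                in_block = False
        elif not line:
            counts["blanks"] += 1
        elif line.startswith("//"):
            counts["comments"] += 1
        elif line.startswith("/*"):
            counts["comments"] += 1
            in_block = "*/" not in line
        else:
            counts["code"] += 1
    counts["max_line_length"] = max(lengths) if lengths else 0
    counts["avg_line_length"] = sum(lengths) / len(lengths) if lengths else 0.0
    return counts
```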
4. CODE COMPLEXITY
• Cyclomatic complexity counts the number of linearly independent paths through the source code
• Gives a rough idea of how complex each function is
• Complexity is the enemy of security!
• Created by Thomas McCabe
http://www.literateprogramming.com/mccabe.pdf
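McCabe's measure can be computed from a control-flow graph as M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components (1 for a single function). The sketch below applies that formula to a hand-written edge list; the node names and the sample graph are invented for illustration.

```python
def cyclomatic_complexity(edges, num_components=1):
    """McCabe's M = E - N + 2P for a control-flow graph given as edge pairs."""
    nodes = set()
    for src, dst in edges:
        nodes.add(src)
        nodes.add(dst)
    return len(edges) - len(nodes) + 2 * num_components


# CFG of: if (c) { A } else { B }  followed by  while (d) { C }
IF_ELSE_WHILE = [
    ("entry", "A"), ("entry", "B"),
    ("A", "loop"), ("B", "loop"),
    ("loop", "C"), ("C", "loop"),
    ("loop", "exit"),
]

if __name__ == "__main__":
    print(cyclomatic_complexity(IF_ELSE_WHILE))
```

The sample graph has 7 edges and 6 nodes, so M = 7 - 6 + 2 = 3: one for the straight path plus one for each of the two decisions.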
4. CODE COMPLEXITY THRESHOLD
http://www.sei.cmu.edu/reports/97hb001.pdf
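The threshold table on this slide was an image. A commonly quoted banding attributed to the SEI report linked above is 1-10, 11-20, 21-50 and above 50; assuming those bands (verify against the report before relying on them), a classifier is a short lookup. The function name and label wording are this sketch's own.

```python
# Upper bound (inclusive) and label for each band; assumed from the
# commonly cited SEI table, not copied from the slide.
BANDS = [
    (10, "simple, low risk"),
    (20, "more complex, moderate risk"),
    (50, "complex, high risk"),
]


def risk_band(cyclomatic):
    """Map a cyclomatic complexity value to a coarse risk label."""
    for upper, label in BANDS:
        if cyclomatic <= upper:
            return label
    return "untestable, very high risk"
```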
4. SOURCE CODE ANALYSIS FLOWGRAPH NOTATION
www.mccabe.com/ppt/SoftwareQualityMetricsToIdentifyRisk.ppt
4. SOURCE CODE VISUALS TOO
BINARY
SOURCE CODE
5. TINTORERA – BLUE SHARK
• “Put source code into context”
• Objective: Get a feel for the code while compiling!!
• Intelligence of source code:
– Code visualizations
– Comments analysis
– API identification
– Metrics
– HTML reports
• C code is transformed into JSON files, so you can query the data and run your own analysis on it
5. TINTORERA INTERNALS
• Two files:
– analyzer.py: To be used while compiling a project
– do_report_tintorera.py: Use after the project has been compiled to generate the report
• Composed of:
– Python code
– JSON data files
– HTML / CSS / JavaScript
5. TINTORERA STRUCTURE
• Python files
• Folders:
– data/ : API JSON file
– templates/ : HTML templates
– js/ : JavaScript code
– images/
– Tintorera_lib/ : Python code
5. TINTORERA INSTALL & USAGE
1. GCC version 4.7 or later
2. Install gcc-python-plugin (see web doc)
3. Set path:
   1. export LD_LIBRARY_PATH=/gcc-python-plugin/gcc-c-api
4. Add line to Makefile (CC= tag)
   1. gcc -fplugin=/gcc-python-plugin/python.so -fplugin-arg-python-script=/tintorera/analyzer.py
5. Run make
6. After compile use:
   1. python do_report_tintorera.py -c tinan.cfg
5. TINTORERA CONFIG FILE
• Edit tinan.cfg to suit your needs
• Set parameters such as:
– Folder to save analysis report
– Enable / disable analysis
• Basic blocks
• Callgraphs
• Comments
• Gimples
• Etc.
– Cyclomatic thresholds
5. TINTORERA DATA FILES
• Folder: /data
• File: tinto_api.json
• JSON file to define APIs
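The slides do not show the layout of tinto_api.json, so the following is only a sketch of how API identification could work with such a file: a hypothetical JSON map from category to C function names, inverted into a lookup and applied to the functions a program calls. The schema, the sample entries and the helper names are all assumptions.

```python
import json

# Hypothetical layout: category -> list of C function names. The real
# tinto_api.json schema is not shown in the slides.
API_JSON = """
{
  "File operations": ["fopen", "open", "unlink"],
  "Networking": ["socket", "bind", "accept", "recv"],
  "Processes": ["fork", "execve", "system"]
}
"""


def build_index(api_json_text):
    """Invert the category map into function name -> category."""
    categories = json.loads(api_json_text)
    return {name: cat for cat, names in categories.items() for name in names}


def classify_calls(callee_names, index):
    """Group called function names by API category, dropping unknown names."""
    found = {}
    for name in callee_names:
        cat = index.get(name)
        if cat is not None:
            found.setdefault(cat, set()).add(name)
    return found
```

Keeping the API list in a data file means new categories can be added without touching the analysis code.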
5. CODE TRANSFORMATION
SOURCE CODE → JSON FILES → HTML REPORT
5. TRANSFORMED JSON FILES
• 3 files:
1. tintorera_bb_file.json: code basic blocks
2. tintorera_meta_info.json: general information, file size, and code & comments not inside functions
3. tintorera_temp_file.json: functions information
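Once the code is in JSON, ordinary Python is enough to query it. The sketch below ranks functions whose cyclomatic complexity exceeds a threshold. The record fields ("name", "cyclomatic", "basic_blocks") and the sample values are invented for illustration; the actual structure of the Tintorera files is not shown in the slides.

```python
import json

# Made-up sample records standing in for per-function data.
SAMPLE = """
[
  {"name": "parse_request", "cyclomatic": 27, "basic_blocks": 61},
  {"name": "send_file", "cyclomatic": 9, "basic_blocks": 20},
  {"name": "main", "cyclomatic": 14, "basic_blocks": 33}
]
"""


def hotspots(functions, threshold=10):
    """Return functions above the threshold, most complex first."""
    over = [f for f in functions if f["cyclomatic"] > threshold]
    return sorted(over, key=lambda f: f["cyclomatic"], reverse=True)


if __name__ == "__main__":
    for f in hotspots(json.loads(SAMPLE)):
        print(f["name"], f["cyclomatic"])
```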
5. TINTORERA_BB_FILE.JSON
5. TINTORERA_META_FILE.JSON
5. TINTORERA_TEMP_FILE.JSON
5. TINTORERA SOURCE CODE METRICS
• Current metrics:
1. Number of:
   1. Lines
   2. Code
   3. Blanks
   4. Comments
   5. Colons
2. Average line length
3. Minimum line
4. Maximum line
5. Total Basic Blocks
6. Total Cyclomatic Complexity
7. Average Cyclomatic Complexity
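The basic-block and cyclomatic totals can be derived directly from per-function control-flow graphs. The sketch below assumes CFG objects shaped like gcc-python-plugin's (a basic_blocks list whose blocks expose succs, the outgoing edges) and applies E - N + 2 per function; the function names and result keys are this sketch's own, not Tintorera's.

```python
def function_complexity(cfg):
    """E - N + 2 over a CFG whose blocks expose `succs` (outgoing edges)."""
    blocks = cfg.basic_blocks
    edges = sum(len(bb.succs) for bb in blocks)
    return edges - len(blocks) + 2


def summarize(cfgs):
    """Total basic blocks plus total and average cyclomatic complexity."""
    cfgs = list(cfgs)
    complexities = [function_complexity(cfg) for cfg in cfgs]
    total_cc = sum(complexities)
    return {
        "total_basic_blocks": sum(len(cfg.basic_blocks) for cfg in cfgs),
        "total_cyclomatic": total_cc,
        "average_cyclomatic": total_cc / len(cfgs) if cfgs else 0.0,
    }
```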
5. SOURCE CODE COMMENT ANALYSIS
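The comment analysis screenshot is not recoverable from the transcript. As an illustration of the idea, the sketch below pulls comments out of C source with a regular expression and flags those containing review-worthy markers. The keyword list is an assumption, and the regex does not handle comment markers inside string literals.

```python
import re

# Matches /* ... */ block comments and // line comments.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
KEYWORDS = ("TODO", "FIXME", "XXX", "HACK")  # assumed markers of interest


def extract_comments(source):
    """Return every comment in the source text, in order of appearance."""
    return [m.group(0) for m in COMMENT_RE.finditer(source)]


def flag_comments(source):
    """Return (matched keywords, comment text) for comments worth a look."""
    flagged = []
    for text in extract_comments(source):
        upper = text.upper()
        hits = [k for k in KEYWORDS if k in upper]
        if hits:
            flagged.append((hits, text))
    return flagged
```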
6. DEMO I: LOOP TESTER
IF ELSE
WHILE
SWITCH
6. DEMO II: SENDMAIL CRACKADDR (CVE-2002-1337)
Pure Complexity….
FUNCTION COMPLEXITY
6. DEMO III: MONGOOSE WEB SERVER ANALYSIS
• Mongoose describes itself as the easiest-to-use web server on the planet, a web server of choice for Web developers (PHP, Ruby, Python, etc.) and Web designers.
6. DEMO IV: BOA WEB SERVER
Boa, a high performance web server for Unix-alike computers
6. DEMO V: OBFUSCATED C CODE ANALYSIS, ENDOH4.C
The International Obfuscated C Code Contest - http://www.ioccc.org/
O function
6. DEMO VI: OBFUSCATED C CODE ANALYSIS, MISAKA
The International Obfuscated C Code Contest - http://www.ioccc.org/
MAIN
Z
7. DRAWBACKS
• gcc-python-plugin needs more work; it fails often
• So does Tintorera…
• Only C / C++ code
7. CONCLUSIONS
• Tintorera helps to analyze C code faster & better
• Practical code understanding for:
– Saving time
– Security reviews
– Fuzzing: what and where to fuzz
7. NEXT STEPS
• Better & focused analysis (security, etc.)
• Vulnerability detection
• More metrics
• Code diff
• Cooler reports!
• Other languages?
8. Q&A
• Thanks!
• @simonroses / @vulnexsl
• www.vulnex.com