Late 2019 Note: This is a post I had sitting in my draft folder since 2015. On review, I think it’s complete enough to publish as-is for historical perspective. Bandit has since moved out of the OpenStack umbrella but it’s still a useful way to find security holes in your applications.
It’s hard to keep track of all the ways we can shoot ourselves in the foot. In C, you could run into buffer overflows if you didn’t check your memory bounds. Eventually, the tooling (and compilers) caught up and started offering sane defaults.
In Python, we don’t have a compile step that checks our code, so we don’t get any compile-time checking. Static analysis is also notoriously hard to do with a dynamic language. It’s only recently that we’ve seen types introduced in Python at all, with the type hints arriving in Python 3.5. As such, finding security holes in Python is, by its nature, a challenging task.
Bandit is an open source tool that scans your code for the most common insecure Python calls and reports what it finds.
Let’s cover how we’d use Bandit, some results I found while testing it out, and some other thoughts on how to use Bandit to improve the security of your code.
How to Use Bandit
Their README is helpful with walking through all the options. Basically, when Bandit runs, it will scan a directory of your choice and look for a set of pre-defined vulnerabilities. If it finds them, it reports them to you. You can specify the levels of feedback you’d like from the tool, ranging from “show me all the things!” to “only report something if it could blow up my car”.
What does it check?
There’s not currently a comprehensive list of what Bandit checks. From looking through the list of plugins, here are a few items that are checked by default:
- Use of blacklisted imports, calls, or functions. Defaults include md5, eval, mark_safe, pickle, yaml, etc.
- Exec called.
- Hardcoded or non-secret passwords
- Shell injections.
- Hardcoded SQL statements
- Calling Linux commands with wildcards (*)
- Requests with certificate validation turned off
- Bad SSL settings
- Using try / except / pass
The closest thing to a comprehensive list can be found by looking at the default bandit.yaml configuration file. Anything in the profiles:All:include section is run by default, and the blacklist sections call out some of the insecure calls they’re looking for.
Personal Results
I ran this on an old program that I wrote, out of curiosity.
kevin$ bandit -r ~/programming/my_old_app
[bandit] INFO using config: ~/.virtualenvs/bandit/etc/bandit/bandit.yaml
[bandit] INFO running on Python 2.7.6
Run started:
2015-09-11 20:09:47.141203
Test results:
>> Issue: Consider possible security implications associated with subprocess module.
Severity: Low Confidence: High
Location: ~/programming/my_old_app/controllers/old_controller.py:1
1 import subprocess
2 import threading
>> Issue: subprocess call with shell=True identified, security issue.
Severity: High Confidence: High
Location: ~/programming/my_old_app/controllers/old_controller.py:123
122 logging.info("Issuing command: %s" % item)
123 p = subprocess.Popen(item, shell=True, stdout=subprocess.PIPE)
124 process_list.append(p)
>> Issue: Use of insecure MD2, MD4, or MD5 hash function.
Severity: Medium Confidence: High
Location: ~/programming/my_old_app/controllers/old_controller.py:230
229 def get_hex_digest(f, blocksize):
230 checksum = hashlib.md5()
231 while 1:
In this case, we have three results. One of them is low severity with a high confidence. In other words, Bandit’s certain we’re doing something questionable but it’s not a big deal.
The second is more serious with a high severity and high confidence. Using shell=True with subprocess can have serious security implications. If we were to find this in our code, it would be a good idea to change it so it no longer uses shell=True or at least sanitize our input before sending it off to the shell.
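As a rough sketch of the first option, the command string can be split into an argument list and run without a shell. The helper name `run_without_shell` and the sample `item` value are made up for illustration; the original application's commands aren't shown here.

```python
import shlex
import subprocess
import sys


def run_without_shell(command):
    """Run a command string without a shell and return its stdout as text."""
    # shlex.split builds an argument list, so characters such as ';' or '|'
    # reach the program as plain text instead of being interpreted by a shell.
    args = shlex.split(command)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode()


# Hypothetical input: the "; echo injected" part stays a literal argument.
item = (
    shlex.quote(sys.executable)
    + ' -c "import sys; print(sys.argv[1:])" "; echo injected"'
)
print(run_without_shell(item))
```

With shell=True, the same string would have been handed to the shell, and the semicolon would have started a second command.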
The last case is medium severity with a high confidence, so somewhere in the middle. For this application, we made a conscious choice to use MD5s and we’re not using it for something that needs to be secure (like password hashing), so that’s working as expected. We might consider adding a # nosec annotation at the end of the line so it ignores this on future runs.
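Here is roughly what that annotation could look like. Only the function signature and the hashlib.md5() call come from the output above; the loop body and the default block size are my own guess at a typical checksum helper.

```python
import hashlib
import io


def get_hex_digest(f, blocksize=65536):
    """Return the MD5 hex digest of a binary file-like object."""
    # MD5 is only a non-security checksum here, so the finding is
    # suppressed on this one line.
    checksum = hashlib.md5()  # nosec
    while True:
        block = f.read(blocksize)
        if not block:
            break
        checksum.update(block)
    return checksum.hexdigest()


print(get_hex_digest(io.BytesIO(b"hello world")))
```

The comment applies to that line only, so a new MD5 call elsewhere in the file would still be reported.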
Field Results
I wanted to know how we were doing as a community. In general, are our open source projects secure and well-written? Are we accidentally performing gem install hairball each time we install a community package? The nature of open source, of course, is that something can change from insecure to secure with a single pull request. I checked a few of the trending projects on GitHub to see what we could find and to see if I could help address any security problems. Here’s what I found:
Note: For the protection of the projects, I will change identities and sample code.
Issue: Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
Severity: Low Confidence: High
Location: monitoring/outputs/progress_bars.py:72
70 @percent.setter
71 def percent(self, value):
72     assert value >= 0
73     assert value <= 100
74     self.__percent = value
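One way to address this finding is to replace the assert statements with an explicit check that raises. This is a sketch only: the class name `ProgressBar` and the error message are invented, and the real project may well prefer a different exception.

```python
class ProgressBar:
    """Hypothetical stand-in for the class in the flagged file."""

    def __init__(self):
        self._percent = 0

    @property
    def percent(self):
        return self._percent

    @percent.setter
    def percent(self, value):
        # An explicit check still runs under `python -O`, unlike assert.
        if not 0 <= value <= 100:
            raise ValueError(
                "percent must be between 0 and 100, got %r" % (value,)
            )
        self._percent = value


bar = ProgressBar()
bar.percent = 50
print(bar.percent)
```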
Issue: Requests call with verify=False disabling SSL certificate checks, security issue.
Severity: High Confidence: High
Location: encrypter.py:206
205 try:
206     response = requests.get(uri, verify=False)
207 except requests.exceptions.RequestException as error:
208     logger.error("Unable to reach %s: %s", uri, error)
Issue: Use of insecure and deprecated function (mktemp).
Severity: Medium Confidence: High
Location: decrypter.py:291
291 listpath = tempfile.mktemp(".tmp", "LIST")
292
293 with open(listpath, "rb") as originalfile:
https://security.openstack.org/guidelines/dg_using-temporary-files-securely.html
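A possible replacement, sketched with the same suffix and prefix as the flagged call, is tempfile.mkstemp, which creates the file at the moment it picks the name. The file contents below are placeholders; what the original project writes there isn't shown.

```python
import os
import tempfile

# mkstemp creates the file and returns an open descriptor in one step,
# so another process cannot claim the name between choosing and opening it.
fd, listpath = tempfile.mkstemp(suffix=".tmp", prefix="LIST")
try:
    with os.fdopen(fd, "wb") as listfile:
        listfile.write(b"example contents")  # placeholder data
    with open(listpath, "rb") as originalfile:
        data = originalfile.read()
finally:
    # mkstemp never deletes the file for us.
    os.remove(listpath)

print(data)
```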
Issue: Pickle library appears to be in use, possible security issue.
Severity: Medium Confidence: High
Location: serializers/pickle.py:43
42 def loads(self, value):
43     return pickle.loads(force_bytes(value))
Issue: Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
Severity: Medium Confidence: High
165 try:
166     conn = urllib2.urlopen(request, timeout=item.timeout_seconds)
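One way to act on that warning is to check the scheme before the URL ever reaches urlopen. The sketch below uses Python 3's urllib.parse rather than the urllib2 module in the flagged code, and both the helper name and the allowed set are assumptions for illustration.

```python
from urllib.parse import urlparse

# Assumed policy: only plain web URLs are acceptable.
ALLOWED_SCHEMES = {"http", "https"}


def check_url_scheme(url):
    """Return url unchanged if its scheme is allowed, else raise ValueError."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("refusing to open URL with scheme %r" % scheme)
    return url


print(check_url_scheme("https://example.com/status"))
```

A caller would pass the validated URL on to urlopen, so a value like file:///etc/passwd is rejected before any request is made.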
Issue: Deserialization with the marshal module is possibly dangerous. Argument/s: Name(id='dump', ctx=Load())
Severity: Medium Confidence: High
Location: grab/spider/cache_backend/postgresql.py:92
88
89 def unpack_database_value(self, val):
91     dump = zlib.decompress(val)
92     return marshal.loads(dump)
Issue: subprocess call with shell=True identified, security issue.
Severity: High Confidence: High
Location: ./src/CryptographicConnection.py:90
87 # Create ECC privatekey
88 proc = subprocess.Popen(
89     "%s ecparam -name prime256v1 -genkey -out %s/key-ecc.pem" % (self.openssl_bin, config.data_dir),
90     shell=True,
91 )
Issue: Use of unsafe yaml load. Allows instantiation of arbitrary objects. Consider yaml.safe_load().
Severity: Medium Confidence: High
Location: ./parser/parser.py:447
446 yamlFile = open(yamlPath)
447 regexes = yaml.load(yamlFile)
448 yamlFile.close()
Issue: Using xml.etree.ElementTree.fromstring to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.etree.ElementTree.fromstring with it’s defusedxml equivilent function.
Severity: Medium Confidence: High
Location: ./output/formatters/xml.py:46
45 try:
46     root = ElementTree.fromstring(response_body.encode('utf8'))
47 except ElementTree.ParseError:
https://pypi.python.org/pypi/defusedxml
https://docs.python.org/3/library/xml.html#xml-vulnerabilities
Include Bandit in your CI Pipeline
One of the great things about running discrete programs is that you can run them whenever you want and it won’t slow you down. The obvious downside to that is that you have to remember to run them. I know if I only ran flake8 when I explicitly ran the program as opposed to every save in Vim, I would never bother with the output. The key, I think, is to hold ourselves accountable.
If you’re using something like Travis CI or another CI server, consider setting Bandit as one of the steps to a successful build. I’d recommend starting by failing the build only on the highest severity level (3), and perhaps gradually tightening it to include lower levels. I’m still getting a feel for the project myself so I’m not sure how accurate all the mid-level warnings are. If you accidentally check in risky code, it’s better to know about it early rather than after someone has hacked you.
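A build step along those lines might look like the sketch below. It assumes the bandit executable is on the PATH and that -r and the -l / -ll / -lll severity filters behave as I understand them (check bandit --help for your version); the function names are made up.

```python
import subprocess

# Assumed mapping from a severity name to Bandit's level filter flag.
SEVERITY_FLAGS = {"low": "-l", "medium": "-ll", "high": "-lll"}


def build_bandit_command(target, min_severity="high"):
    """Build the argument list for a recursive Bandit scan of target."""
    return ["bandit", "-r", target, SEVERITY_FLAGS[min_severity]]


def run_gate(target, min_severity="high"):
    """Run Bandit and return its exit code (non-zero should fail the build)."""
    return subprocess.call(build_bandit_command(target, min_severity))


print(" ".join(build_bandit_command(".")))
```

A CI job could end with something like sys.exit(run_gate(".")), then move min_severity to "medium" once the high-severity findings are cleared.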
Extending Bandit
Bandit’s also extensible, so if you find a vulnerability that the project doesn’t adequately cover, you can add a check of your own.
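I won’t reproduce Bandit’s plugin API here (see the project’s documentation for that). To give a feel for what a custom check has to do, here is a toy version written with only the standard library’s ast module. It is not a Bandit plugin, and the function name is made up.

```python
import ast


def find_shell_true(source, filename="<string>"):
    """Return the line numbers of calls that pass shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # Match only the literal keyword argument shell=True.
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


sample = (
    "import subprocess\n"
    "subprocess.Popen('ls', shell=True)\n"
    "subprocess.Popen(['ls'])\n"
)
print(find_shell_true(sample))
```

A real plugin would report severity and confidence alongside the location, as in the output shown earlier.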