The Vulnerability History Project

CVE-2018-7536
aka Irregular Expression

A vulnerability in Django could allow an unauthenticated, remote attacker to cause a denial of service on a targeted system. The vulnerability is in the django.utils.html.urlize() function and is due to insufficient validation of user-provided input. To exploit it, an attacker could submit crafted input to an affected system and tie up the system while that input is processed.
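The fix notes in this entry attribute the problem to catastrophic backtracking in regular expressions. The sketch below shows that general effect with a deliberately simple pattern. It is an illustration only: `VULNERABLE`, `SAFE`, and `time_match` are hypothetical names, and the pattern is not the expression used in Django's urlize().

```python
import re
import time

# Illustrative pattern only, not the expression used in Django's urlize().
# The nested quantifier lets the engine split a run of "a" characters in
# exponentially many ways before it can conclude that no match exists.
VULNERABLE = re.compile(r"^(a+)+$")

# Accepts and rejects exactly the same strings, but has no nested
# quantifier, so a failed match is decided in a single pass.
SAFE = re.compile(r"^a+$")


def time_match(pattern, text):
    """Return (matched, seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # The trailing "!" forces both patterns to fail; only the nested one
    # slows down sharply as the run of "a" characters gets longer.
    for n in (12, 14, 16, 18):
        text = "a" * n + "!"
        _, nested_seconds = time_match(VULNERABLE, text)
        _, flat_seconds = time_match(SAFE, text)
        print(f"n={n:2d}  nested: {nested_seconds:.4f}s  flat: {flat_seconds:.6f}s")
```

Each extra character roughly doubles the work for the nested pattern, which is why a short crafted string can be enough to stall a request.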


In my opinion, this vulnerability surfaced because the code became overly complex and untested. The urlize function in 2005 was 29 lines with the sole responsibility to "Convert any URLs in text into clickable links". As of 2018, when the fix was added, it was 132 lines with 4 inline functions. Over the course of 13 years the function has grown to roughly 4.5 times its original size. If the function had been written initially with unit tests, I suspect it would not have grown so far out of proportion and a more modular design would have been implemented. Unfortunately, due to the high complexity of this method, I assume more vulnerabilities will be found over time, especially if it keeps getting added to and is not refactored.
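As a rough illustration of the modular, unit-tested design argued for above, the sketch below pulls one small responsibility (splitting trailing punctuation off a candidate URL) into its own function and tests it directly. The helper `split_trailing_punctuation` and the `TRAILING_PUNCTUATION` set are hypothetical and are not taken from Django's code. Plain string methods stand in for a regular expression so there is nothing to backtrack.

```python
import unittest

# Hypothetical helper, not Django's code: the punctuation set is an
# assumption chosen for the example.
TRAILING_PUNCTUATION = ".,:;!?"


def split_trailing_punctuation(word):
    """Return (core, trail), where trail is the run of trailing punctuation."""
    core = word.rstrip(TRAILING_PUNCTUATION)
    return core, word[len(core):]


class SplitTrailingPunctuationTests(unittest.TestCase):
    def test_plain_url_is_unchanged(self):
        self.assertEqual(
            split_trailing_punctuation("http://example.com"),
            ("http://example.com", ""),
        )

    def test_multiple_trailing_marks_are_split_off(self):
        self.assertEqual(
            split_trailing_punctuation("http://example.com/?x=1!!.,"),
            ("http://example.com/?x=1", "!!.,"),
        )

    def test_long_punctuation_run_is_handled(self):
        # rstrip scans the string once, however long the trailing run is.
        word = "http://example.com" + "." * 100_000
        core, trail = split_trailing_punctuation(word)
        self.assertEqual(core, "http://example.com")
        self.assertEqual(len(trail), 100_000)
```

Saved as a module, the tests can be run with `python -m unittest`. A helper this small can be reviewed and tested on its own, including with hostile input, which is harder to do for logic buried inside a 132-line function.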
CVE: CVE-2018-7536
CWE: 185
ipc:
  note: I found no inter-process communication.
  answer: false
  question: |
    Did the feature that this vulnerability affected use inter-process
    communication? IPC includes OS signals, pipes, stdin/stdout, message
    passing, and clipboard. Writing to files that another program in this
    software system reads is another form of IPC.

    Answer should be boolean.
CVSS: AV:N/AC:L/Au:N/C:N/I:N/A:P
bugs: []
i18n:
  note: It had nothing to do with internationalization; it had to do with catastrophic
    backtracking vulnerabilities in two regular expressions.
  answer: false
  question: |
    Was the feature impacted by this vulnerability about internationalization
    (i18n)? An internationalization feature is one that enables people from all
    over the world to use the system. This includes translations, locales,
    typography, unicode, or various other features.

    Answer should be boolean. Write a note about how you came to the conclusions
    you did.
repo: https://github.com/django/django/
vccs:
- note: Fixed 11911 -- Made the urlize filter smarter with closing punctuation. 1-8-2012
  commit: 15d10a5210378bba88c7dfa1f45a4d3528ddfc3f
- note: Fixed 20364 -- Changed urlize regexes to include quotation marks as punctation.
    9-23-2013
  commit: 6c06adad1dc45631c1c220d7f8fb531a9cf3ed55
- note: Fixed 26193 -- Made urlize() trim multiple trailing punctuation. 2-11-2016
  commit: dec334cb66b3ee59cb82e1bb99a584aa0b9fbbd5
- note: Fixed 7542 -- Fixed bug in urlize where it was appending 'http://' to the
    link text. Thanks for the patch and tests 6-26-2008
  commit: b7fea9409618ac23485a1048f4435f6afbc11739
- note: Fixed urlize regression with entities in query strings. 3-6-2015
  commit: ac07890f959c467b3fc9c6dd6d36aafc2eff1fcc
- note: Fixed urlize after smart_urlquote rewrite. 8-9-2014
  commit: b9d9287f59eb5c33dd8bc81179b4cf197fd54456
- note: This is the initial commit of the Urlize function from 7-12-2005 and where
    I believe the vulnerability originated.
  commit: ed114e15106192b22ebb78ef5bf5bce72b419d13
fixes:
- note: 2.0.x Fixed CVE-2018-7536 -- Fixed catastrophic backtracking in urlize and
    urlizetrunc template filters. 2-24-2018
  commit: e157315da3ae7005fa0683ffc9751dbeca7306c8
- note: 1.11.x Fixed CVE-2018-7536 -- Fixed catastrophic backtracking in urlize and
    urlizetrunc template filters. 2-24-2018
  commit: abf89d729f210c692a50e0ad3f75fb6bec6fae16
- note: 1.8.x Fixed CVE-2018-7536 -- Fixed catastrophic backtracking in urlize and
    urlizetrunc template filters. 2-24-2018
  commit: 1ca63a66ef3163149ad822701273e8a1844192c2
bounty:
  amt: 
  url: 
  announced: 
lessons:
  yagni:
    note: 
    applies: false
  question: |
    Are there any common lessons we have learned from class that apply to this
    vulnerability? In other words, could this vulnerability serve as an example
    of one of those lessons?

    Leave "applies" blank or put false if you did not see that lesson (you do
    not need to put a reason). Put "true" if you feel the lesson applies and put
    a quick explanation of how it applies.

    Don't feel the need to claim that ALL of these apply, but it's pretty likely
    that one or two of them apply.

    If you think of another lesson we covered in class that applies here, feel
    free to give it a small name and add one in the same format as these.
  serial_killer:
    note: 
    applies: false
  complex_inputs:
    note: |
      The vulnerability involved matching untrusted text against regular
      expressions, which are a very complex language to reason about.
    applies: true
  distrust_input:
    note: Certain crafted inputs could trigger the DoS condition.
    applies: true
  least_privilege:
    note: 
    applies: false
  native_wrappers:
    note: 
    applies: false
  defense_in_depth:
    note: 
    applies: false
  secure_by_default:
    note: "Had the original author put some form of protection against DoS attacks
      in place, such as timeouts for functions, \nat worst all the attacker could do is cause
      a timeout, instead of taking down an entire system.\n"
    applies: true
  environment_variables:
    note: 
    applies: false
  security_by_obscurity:
    note: 
    applies: false
  frameworks_are_optional:
    note: 
    applies: false
reviews: []
sandbox: 
upvotes: 4
CWE_note: 
mistakes:
  answer: "In my opinion, this vulnerability surfaced because the code became
    overly complex and untested.\nThe urlize function in 2005 was 29 lines
    with the sole responsibility to \"Convert any URLs in text into clickable links\".\nAs
    of 2018, when the fix was added, it was 132 lines with 4 inline
    functions.\nOver the course of 13 years the function has grown to roughly
    4.5 times its original size.\nIf the function had been written
    initially with unit tests, I suspect it would not have grown so far\nout
    of proportion and a more modular design would have been implemented. Unfortunately,
    due to the high complexity of this method,\nI assume more vulnerabilities
    will be found over time, especially if it keeps getting added to and is not refactored.\n"
  question: |
    In your opinion, after all of this research, what mistakes were made that
    led to this vulnerability? Coding mistakes? Design mistakes?
    Maintainability? Requirements? Miscommunications?

    Look at the CWE entry for this vulnerability and examine the mitigations
    they have written there. Are they doing those? Does the fix look proper?

    Use those questions to inspire your answer. Don't feel obligated to answer
    every one. Write a thoughtful entry here that those in the software
    engineering industry would find interesting.
nickname: Irregular Expression
subsystem:
  name: utils
  answer: "The html.py file in which the vulnerability was found describes this \nfile
    to be \"HTML utilities suitable for global use\" and it is located within a utils
    folder\nleading me to the conclusion that this is best described as the html utility
    subsystem.\n"
  question: |
    What subsystems was the mistake in?

    Most systems don't have a formal list of their subsystems, but you can
    usually infer them from path names, bug report tags, or other key words
    used. A single source file is not what we mean by a subsystem. In Django,
    the "Component" field on the bug report is useful. But there may be other
    subsystems involved.

    Your subsystem name(s) should not have any dots or slashes in them. Only
    alphanumerics, whitespace, _, - and @. Feel free to add multiple using a YAML
    array.

    In the answer field, explain where you saw these words.
    In the name field, a subsystem name (or an array of names)

    e.g. clipboard, model, view, controller, mod_dav, ui, authentication
discovered:
  answer: James Davis reported the vulnerability to Django.
  contest: false
  question: |
    How was this vulnerability discovered?

    Go to the bug report and read the conversation to find out how this was
    originally found. Answer in longform below in "answer", fill in the date in
    YYYY-MM-DD, and then determine if the vulnerability was found by a Google
    employee (you can tell from their email address). If it's clear that the
    vulnerability was discovered by a contest, fill in the name there.

    The automated, contest, and developer flags can be true, false, or nil.

    If there is no evidence as to how this vulnerability was found, then please explain where you looked.
  automated: false
  developer: false
description: "A vulnerability in Django was found that could allow an unauthenticated,
  \nremote attacker to cause a denial of service condition on a targeted system.\nThe
  vulnerability can be found in the django.utils.html.urlize() function and is due
  to\ninsufficient validation of user provided input. To exploit this vulnerability
  an attacker could\nsubmit a crafted input to an affected system and cause a denial
  of service attack on the targeted system.\n"
unit_tested:
  fix: true
  code: false
  question: |
    Were automated unit tests involved in this vulnerability?
    Was the original code unit tested, or not unit tested? Did the fix involve
    improving the automated tests?

    For code: and fix: - your answer should be boolean.

    For the code_answer below, look not only at the fix but the surrounding
    code near the fix in related directories and determine if there were unit tests involved for this subsystem. The code

    For the fix_answer below, check if the fix for the vulnerability involves
    adding or improving an automated test to ensure this doesn't happen again.
  fix_answer: The fix did involve adding an additional test aka test_urlize.
  code_answer: There were unit tests, but not for the urlize function.
discoverable: 
reported_date: 
specification:
  answer: false
  answer_note: There is no mention of a specification being followed for this code.
  instructions: |
    Is there mention of a violation of a specification? For example,
    an RFC specification, a protocol specification, or a requirements
    specification.

    Be sure to check all artifacts for this: bug report, security
    advisory, commit message, etc.

    The answer field should be boolean. In answer_note, please explain
    why you come to that conclusion.
announced_date: 2018-03-09T20:29Z
curation_level: 1
published_date: '2018-03-09'
CWE_instructions: |
  Please go to http://cwe.mitre.org and find the most specific, appropriate CWE
  entry that describes your vulnerability. We recommend going to
  https://cwe.mitre.org/data/definitions/699.html for the Software Development
  view of the vulnerabilities. We also recommend the tool
  http://www.cwevis.org/viz to help see how the classifications work.

  If you have anything to note about why you classified it this way, write
  something in CWE_note. This field is optional.

  Just the number here is fine. No need for name or CWE prefix. If more than one
  apply here, then choose the best one and mention the others in CWE_note.
yaml_instructions: |
  ===YAML Primer===
  This is a dictionary data structure, akin to JSON.
  Everything before a colon is a key, and the values here are usually strings
  For one-line strings, you can just use quotes after the colon
  For multi-line strings, as we do for our instructions, you put a | and then
  indent by two spaces

  For readability, we hard-wrap multi-line strings at 80 characters. This is
  not absolutely required, but appreciated.
bounty_instructions: |
  If you came across any indications that a bounty was paid out for this
  vulnerability, fill it out here. Or correct it if the information already here
  was wrong. Otherwise, leave it blank.
interesting_commits:
  commits:
  - note: |
      This commit is interesting because it is the start of the problem with this urlize function.
      In this commit it starts out innocent, as a function whose purpose is to "Convert any URLs in text into clickable links.".
      After 13 years the function has grown out of proportion: it has inner functions, is hard to follow, and is slow.
    commit: ed114e15106192b22ebb78ef5bf5bce72b419d13
  - note: |
      It's amazing it took over 13 years to finally create a unit test for this function.
      Had its original creator started with unit tests, this function may have stayed maintainable.
      Instead it is overly complex and hard to follow.
    commit: 1ca63a66ef3163149ad822701273e8a1844192c2
  question: |
    Are there any interesting commits between your VCC(s) and fix(es)?

    Write a brief (under 100 words) description of why you think this commit was
    interesting in light of the lessons learned from this vulnerability. Any
    emerging themes?
curated_instructions: |
  If you are manually editing this file, then you are "curating" it.

  Set the version number that you were given in your instructions.

  This will enable additional editorial checks on this file to make sure you
  fill everything out properly. If you are a student, we cannot accept your work
  as finished unless curated is properly updated.
upvotes_instructions: |
  For the first round, ignore this upvotes number.

  For the second round of reviewing, you will be giving a certain amount of
  upvotes to each vulnerability you see. Your peers will tell you how
  interesting they think this vulnerability is, and you'll add that to the
  upvotes score on your branch.
nickname_instructions: |
  A catchy name for this vulnerability that would draw attention to it. If the
  report mentions a nickname, use that. Must be under 30 characters.
  Optional.
reported_instructions: |
  What date was the vulnerability reported to the security team? Look at the
  security bulletins and bug reports. It is not necessarily the same day that the
  CVE was created.  Leave blank if no date is given.
  Please enter your date in YYYY-MM-DD format.
announced_instructions: |
  Was there a date that this vulnerability was announced to the world? You can
  find this in changelogs, blogs, bug reports, or perhaps the CVE date. A good
  source for this is Chrome's Stable Release Channel
  (https://chromereleases.googleblog.com/).
  Please enter your date in YYYY-MM-DD format.
fixes_vcc_instructions: |
  Please put the commit hash in "commit" below (see my example in
  CVE-2011-3092.yml). Fixes and VCCs follow the same format.
published_instructions: |
  Is there a published fix or patch date for this vulnerability?
  Please enter your date in YYYY-MM-DD format.
description_instructions: |
  You can get an initial description from the CVE entry on cve.mitre.org. These
  descriptions are a fine start, but they can be kind of jargony.

  Rewrite this description IN YOUR OWN WORDS. Make it interesting and easy to
  read to anyone with some programming experience. We can always pull up the NVD
  description later to get more technical.

  Try to still be specific in your description, but remove project-specific
  stuff. Remove references to versions, specific filenames, and other jargon
  that outsiders to this project would not understand. Technology like "regular
  expressions" is fine, and security phrases like "invalid write" are fine to
  keep too.

  Your target audience is people just like you before you took any course in
  security
