The Vulnerability History Project

CVE-2008-0005

When sending HTML for web pages, the software did not specify a text encoding. Because browsers can auto-detect the encoding, any encoding could effectively be used. This includes UTF-7, which uses + and - characters to delimit encoded sequences that stand for other characters. The < and > characters can therefore be written in UTF-7 without those actual characters appearing in the text. This means anyone who knew their way around UTF-7 could craft a cross-site scripting attack that runs arbitrary JavaScript code.
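
The trick described above can be illustrated with a minimal Python sketch. The payload string and variable names are hypothetical and are not taken from the original exploit; the sketch only shows that text with no literal < or > passes through an ordinary escaping step unchanged, yet becomes a script tag once it is interpreted as UTF-7.

```python
import html

# Hypothetical attacker-supplied text: "+ADw-" and "+AD4-" are the UTF-7
# spellings of "<" and ">".
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# A typical escaping step finds nothing to escape, because the payload
# contains no literal <, >, &, or quote characters.
escaped = html.escape(payload)

# A browser that guesses UTF-7 for the page would read the same bytes
# as if they had been decoded like this.
as_seen_by_browser = escaped.encode("ascii").decode("utf-7")

print(escaped)
print(as_seen_by_browser)
```

The first line printed is the payload exactly as submitted; the second is the markup a UTF-7-guessing browser would act on.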


This appears to have been an oversight dating from when content types were first specified, and it was not addressed until someone demonstrated that it was a vulnerability. It persisted even though several other commits reworked the way content types were set. The mitigation taken is quite straightforward: the charset was explicitly specified as ISO-8859-1. This keeps browsers from guessing any other encoding, so the output can be sanitized much more effectively.
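
As a rough sketch of the idea behind the mitigation (not the actual httpd change, which is in the fix commit listed below), the following Python snippet builds a Content-Type value that names the charset and shows that the same bytes stay inert when read as ISO-8859-1. The helper name `html_content_type` and the sample body are assumptions made for illustration.

```python
def html_content_type(charset: str = "ISO-8859-1") -> str:
    """Build a Content-Type header value that names the charset explicitly."""
    return f"text/html; charset={charset}"


# Header value a server could send instead of a bare "text/html".
header = html_content_type()

# Hypothetical response body containing UTF-7-style sequences.
body = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# With the charset pinned, the bytes are read as ISO-8859-1 and remain
# ordinary text: no "<" or ">" ever appears.
text = body.decode("iso-8859-1")

print(header)
print(text)
```

Pinning a single encoding means the sanitizer and the browser agree on what each byte means, which is what makes escaping < and > sufficient again.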
CVE: CVE-2008-0005
CWE: 116
ipc:
  note: 
  answer: 
  question: |
    Did the feature that this vulnerability affected use inter-process
    communication? IPC includes OS signals, pipes, stdin/stdout, message
    passing, and clipboard. Writing to files that another program in this
    software system reads is another form of IPC.

    Answer should be boolean. Explain your answer
bugs: []
i18n:
  note: 
  answer: 
  instructions: |
    Was the feature impacted by this vulnerability about internationalization
    (i18n)? An internationalization feature is one that enables people from all
    over the world to use the system. This includes translations, locales,
    typography, unicode, or various other features.

    Answer should be boolean. Write a note about how you came to the conclusions
    you did.
repo: 
vccs:
- note: |-
    Introduces code for DAV module.

    Formerly 92291b5ed38235ba0667769412f86e16cc1b3076 before HTTPD rewrote Git history.
  commit: 59ad7a1b7cca4e17013bf4a0c5c220256f37472f
- note: |-
    Baseline for Apache 2.0.

    Formerly 5d855a48777529f38b148c19c021a01685677f79 before HTTPD rewrote Git history.
  commit: e3e87d34a0280b4e88c87b86b715d2c710ffb7ec
- note: |-
    Introduces code for LDAP module.

    Formerly 1a9db703a5f36b09ae97f8ff5bcd94e7740dfe49 before HTTPD rewrote Git history.
  commit: 011142a2afe78de0f27ea9de34fab8143c0d5271
- note: |-
    Add functionality to load balancer, which also made calls to set content type.

    Formerly 433584106cf110b74b74193f7bf3acebdf3503bb before HTTPD rewrote Git history.
  commit: 2d7f11c8035b26a244adc99df9d7a301f18c7107
- note: |-
    Add content type sets to Proxy module.

    Formerly c91b14ab99acb64a46de01bd88958d67ee904e0a before HTTPD rewrote Git history.
  commit: 07e7e196f4e0ab1edf3172f1a128c4253c20b504
fixes:
- note: |-
    Specifies explicit charsets for all pertinent pages.

    Formerly 843bc0eebe984216b90427c75f0d8d3af4a6b1c4 before HTTPD rewrote Git history.
  commit: b514669c7a6fac30d166fa392d7ab803fae2bca8
bounty:
  amt: 
  url: 
  announced: 
lessons:
  yagni:
    note: |
      Arbitrary encoding may have been considered more of a feature at the beginning, but
      the benefit was relatively low while the security risk was found to be high.
    applies: true
  question: |
    Are there any common lessons we have learned from class that apply to this
    vulnerability? In other words, could this vulnerability serve as an example
    of one of those lessons?

    Leave "applies" blank or put false if you did not see that lesson (you do
    not need to put a reason). Put "true" if you feel the lesson applies and put
    a quick explanation of how it applies.

    Don't feel the need to claim that ALL of these apply, but it's pretty likely
    that one or two of them apply.

    If you think of another lesson we covered in class that applies here, feel
    free to give it a small name and add one in the same format as these.
  serial_killer:
    note: 
    applies: 
  complex_inputs:
    note: Complex inputs involving UTF-7 text could be crafted to successfully pull
      off XSS.
    applies: true
  distrust_input:
    note: Input was already sanitized, but they didn't expect or prepare for UTF-7.
    applies: true
  least_privilege:
    note: 
    applies: 
  native_wrappers:
    note: 
    applies: 
  defense_in_depth:
    note: 
    applies: 
  secure_by_default:
    note: 
    applies: 
  environment_variables:
    note: 
    applies: 
  security_by_obscurity:
    note: 
    applies: 
  frameworks_are_optional:
    note: 
    applies: 
reviews: []
upvotes: 10
CWE_note: 
mistakes:
  answer: |
    This appears to have been an oversight dating from when content types were first specified,
    and it was not addressed until someone demonstrated that it was a vulnerability. It persisted
    even though several other commits reworked the way content types were set.

    The mitigation taken is quite straightforward: the charset was explicitly specified as ISO-8859-1.
    This keeps browsers from guessing any other encoding, so the output can be sanitized much more effectively.
  question: |
    In your opinion, after all of this research, what mistakes were made that
    led to this vulnerability? Coding mistakes? Design mistakes?
    Maintainability? Requirements? Miscommunications?

    Look at the CWE entry for this vulnerability and examine the mitigations
    they have written there. Are they doing those? Does the fix look proper?

    Use those questions to inspire your answer. Don't feel obligated to answer
    every one. Write a thoughtful entry here that those in the software
    engineering industry would find interesting.
nickname: 
reported: 
announced: '2008-01-10'
published: 
subsystem:
  name:
  - dav
  - proxy
  - ldap
  answer: This issue spanned most modules in httpd that set content types to HTML.
  question: |
    What subsystems was the mistake in?

    Look at the paths of the source code files that were fixed to get
    directory names. Look at comments in the code. Look at how the bug
    report was tagged.
discovered:
  date: '2007-12-15'
  answer: An exploit was provided by someone on the SecurityReason Research team.
  google: false
  contest: false
  question: |
    How was this vulnerability discovered?

    Go to the bug report and read the conversation to find out how this was
    originally found. Answer in longform below in "answer", fill in the date in
    YYYY-MM-DD, and then determine if the vulnerability was found by a Google
    employee (you can tell from their email address). If it's clear that the
    vulnerability was discovered by a contest, fill in the name there.

    The "automated" flag can be true, false, or nil.
    The "google" flag can be true, false, or nil.

    If there is no evidence as to how this vulnerability was found, then you may
    leave this part blank.
  automated: false
description: |
  When sending HTML for web pages, the software did not specify a text encoding.

  Because browsers can auto-detect the encoding, any encoding could effectively
  be used. This includes UTF-7, which uses + and - characters to delimit encoded
  sequences that stand for other characters. The < and > characters can therefore
  be written in UTF-7 without those actual characters appearing in the text.

  This means anyone who knew their way around UTF-7 could craft a cross-site
  scripting attack that runs arbitrary JavaScript code.
unit_tested:
  fix: false
  code: false
  answer: |
    These modules do not appear to have unit tests.

    The fix did not introduce or modify any tests.
  question: |
    Were automated unit tests involved in this vulnerability?
    Was the original code unit tested, or not unit tested? Did the fix involve
    improving the automated tests?

    For the "code" answer below, look not only at the fix but also at the
    surrounding code near the fix and determine whether there were unit tests
    involved for this module.

    For the "fix" answer below, check if the fix for the vulnerability involves
    adding or improving an automated test to ensure this doesn't happen again.
specification:
  answer: 
  answer_note: 
  instructions: |
    Is there mention of a violation of a specification? For example,
    an RFC specification, a protocol specification, or a requirements
    specification.

    Be sure to check all artifacts for this: bug report, security
    advisory, commit message, etc.

    The answer field should be boolean. In answer_note, please explain
    why you come to that conclusion.
curation_level: 1
CWE_instructions: |
  Please go to cwe.mitre.org and find the most specific, appropriate CWE entry
  that describes your vulnerability. (Tip: this may not be a good one to start
  with - spend time understanding this vulnerability before making your choice!)
autodiscoverable:
  answer: 
  answer_note: 
  instructions: |
    Is it plausible that a fully automated tool could have discovered
    this? These are tools that require little knowledge of the domain,
     e.g. automatic static analysis, compiler warnings, fuzzers.

    Examples for true answers: SQL injection, XSS, buffer overflow

    Examples for false: RFC violations, permissions issues, anything
    that requires the tool to be "aware" of the project's
    domain-specific requirements.

    The answer field should be boolean. In answer_note, please explain
    why you come to that conclusion.
yaml_instructions: 
bounty_instructions: |
  If you came across any indications that a bounty was paid out for this
  vulnerability, fill it out here. Or correct it if the information already here
  was wrong. Otherwise, leave it blank.
interesting_commits:
  commits:
  - note: |-
      A new method for setting the content type was introduced in place of an older one.

      This would have been a great opportunity to add explicit charsets.


      Formerly 0b94edd92746a061283c6b09f9d72d5c165de76f before HTTPD rewrote Git history.
    commit: 470edb9dd87afbdb66ed0fd36bce7c97c2889086
  - note: |-
      The method for setting the content type was changed to a new one. Again, this would
      have been a good opportunity to specify charsets.


      Formerly bfe4fb0484196a1fec98a75574c239c5e7544225 before HTTPD rewrote Git history.
    commit: 00feac4b964bd7f9f3dd2f0fef98873df06cf868
  question: |
    Are there any interesting commits between your VCC(s) and fix(es)?

    Write a brief (under 100 words) description of why you think this commit was
    interesting in light of the lessons learned from this vulnerability. Any
    emerging themes?
curated_instructions: |
  If you are manually editing this file, then you are "curating" it. Set the
  entry below to "true" as soon as you start. This will enable additional
  integrity checks on this file to make sure you fill everything out properly.
  If you are a student, we cannot accept your work as finished unless curated is
  set to true.
upvotes_instructions: |
  For the first round, ignore this upvotes number.

  For the second round of reviewing, you will be giving a certain amount of
  upvotes to each vulnerability you see. Your peers will tell you how
  interesting they think this vulnerability is, and you'll add that to the
  upvotes score on your branch.
nickname_instructions: |
  A catchy name for this vulnerability that would draw attention to it. If the
  report mentions a nickname, use that. Must be under 30 characters.
  Optional.
reported_instructions: 
announced_instructions: |
  Was there a date that this vulnerability was announced to the world? You can
  find this in changelogs, blogs, bug reports, or perhaps the CVE date. A good
  source for this is Chrome's Stable Release Channel
  (https://chromereleases.googleblog.com/).
  Please enter your date in YYYY-MM-DD format.
fixes_vcc_instructions: |
  Please put the commit hash in "commit" below (see my example in
  CVE-2011-3092.yml). Fixes and VCCs follow the same format.
published_instructions: 
description_instructions: |
  You can get an initial description from the CVE entry on cve.mitre.org. These
  descriptions are a fine start, but they can be kind of jargony.

  Rewrite this description in your own words. Make it interesting and easy to
  read to anyone with some programming experience. We can always pull up the NVD
  description later to get more technical.

  Try to still be specific in your description, but remove Chromium-specific
  stuff. Remove references to versions, specific filenames, and other jargon
  that outsiders to Chromium would not understand. Technology like "regular
  expressions" is fine, and security phrases like "invalid write" are fine to
  keep too.


Beware of complex inputs

Don't just think about code complexity, think about *input* complexity.
