
CSS Generated Content: Revisited

(originally posted April 2, 2007. Updated to appear at front for a current project)

Character References in CSS Generated Content Strings

With web standards compliance in mind (valid markup, the right DTD, and so on), difficulties and unexpected challenges inevitably arise.

Many times when I’ve used an NCR (numeric character reference), a named entity, or even an ASCII control character, I have had trouble achieving my desired output. Such problems are especially common where I’ve used CSS generated content.

I welcome feedback on this issue. I would love to hear the recommendations of others who have suffered these problems and found their own solutions.

Unicode Character Encoding Syntax: the Solution?

In my experience, the character escape syntax of other languages does not carry over predictably to CSS 2.1, at least until one understands the CSS specification itself. What works in PHP, XHTML, and other languages for generating a newline character may not work in CSS. The developer’s best chance of success is to gain a thorough understanding of the syntax that CSS recommends for character escapes in generated content.

The first thing the developer must realize is that inserting characters through CSS, especially control characters such as the NEWLINE control character, requires the CSS content property. content is a CSS property that allows the developer to insert characters (automatically generated or user-defined) into the document output.
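
As a minimal sketch of the idea, assuming a hypothetical p.note class that is not part of the original project, the following rule inserts a label followed by a newline. The \A escape is the CSS form of the newline character (U+000A), and the white-space declaration is what allows the line break to render in generated content.

```css
/* Hypothetical selector, for illustration only */
p.note:before {
    /* "\A " is a newline escape; the single space after it ends the escape and is not output */
    content: "Note:\A ";
    /* without this, the newline collapses like ordinary white space */
    white-space: pre;
}
```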

Here I offer an account of my own success with outputting Encoded characters through a CSS content String by way of the CSS :before pseudo-element.

CSS Content :before and :after

NOTE: If you wish to use the CSS content property to include a new line, that is, to generate a new line within the string output for aesthetic value, I recommend reading the section titled Characters and Case (Section 4.1.3) of the W3C CSS 2.1 Recommendation or CSS 2 Specification, specifically the part that describes the backslash and escaped characters. Direct your attention to the bullet-point section that begins, “In CSS 2.1, a backslash (\) character indicates three types of character escapes.” It includes the following important rule:

If a character in the range [0-9a-f] follows the hexadecimal number, the end of the number needs to be made clear. There are two ways to do that:

1. with a space (or other whitespace character): “\26 B” (“&B”). In this case, user agents should treat a “CR/LF” pair (U+000D/U+000A) as a single whitespace character.
2. by providing exactly 6 hexadecimal digits: “\000026B” (“&B”)

In fact, these two methods may be combined. Only one whitespace character is ignored after a hexadecimal escape. Note that this means that a “real” space after the escape sequence must itself either be escaped or doubled.
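
The two termination methods from the rule can be sketched side by side. The selectors here are hypothetical; the first two rules should generate the same two characters, an ampersand followed by a capital B, and the third shows the doubled space mentioned in the rule.

```css
/* Method 1: a single space ends the escape and is swallowed */
.amp-space:before { content: "\26 B"; }      /* renders &B */

/* Method 2: exactly six hex digits, so no terminator is needed */
.amp-six:before   { content: "\000026B"; }   /* renders &B */

/* A real space after the escape must be doubled (or escaped) */
.amp-gap:before   { content: "\26  B"; }     /* renders & B */
```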

Update: 2008-04-05
Following the rule cited above, I can say from experience that, as the W3C states, exactly 6 hexadecimal digits preceded by the backslash escape character do produce the desired generated content, including the Unicode character. For example, to generate a number sign (also known as an octothorpe) through the CSS content property, first look up the hex code for that character; as an HTML character reference it is &#x23;. For CSS, the hex value must be written differently. Using the 6 hex digit rule, the property and value pair looks like this:
content: "Step \000023 " counter(section); (that is, escape-slash, zero, zero, zero, zero, two, three)
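
For context, here is a sketch of how that declaration might sit in a complete rule set. The counter name section comes from the declaration above; the body and h2 selectors are assumptions made for illustration.

```css
/* Start the counter once for the whole document (assumed placement) */
body {
    counter-reset: section;
}

/* Hypothetical target: number every second-level heading */
h2:before {
    counter-increment: section;
    /* \000023 is "#"; the one space after the escape is swallowed,
       so the number follows the sign directly: "Step #1", "Step #2", ... */
    content: "Step \000023 " counter(section);
}
```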

Generated Content In Practice:

The Scenario

I needed to output character references rather than the characters themselves, because I didn’t want a section of the output to fall under a <pre> tag where it wasn’t supposed to yet.

So the CSS looked like this:

 pre:before {content : "&lt;pre&gt;"}

But this did not work: the string was output exactly as written, with the entities shown literally in the output. After trying a few different sets of characters and character reference types, I achieved success with the following:

pre:before {content: "\3c pre\3e";}

(What I’ve referenced as the “successful” pattern should appear to you as a backslash followed by the numeral 3 and the letter “c”. When I first wrote this I had been unable to find this particular character reference in any charset table; I only knew that it worked. The values are hexadecimal Unicode code points: U+003C is the less-than sign and U+003E is the greater-than sign, so the output shows those two glyphs.)
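
The same hexadecimal approach extends to the :after pseudo-element. In this sketch the code.tag selector is hypothetical; each escape is the Unicode code point of the character, followed by one terminating space that is not output.

```css
/* Wrap the element's text in angle brackets, so "pre" displays as "<pre>" */
code.tag:before { content: "\3c "; }   /* U+003C, less-than sign */
code.tag:after  { content: "\3e "; }   /* U+003E, greater-than sign */
```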

UPDATE: the final CSS declaration evolved into the following, which came closest to what I was attempting to achieve with the generated content:

pre:before {
    content: "viewing \3c pre\3e : a";
    font-family: "Arial Black", Gadget, sans-serif;
}

Note: the <pre> block above shows the CSS that is ultimately used to render :before, as discussed. It also illustrates the result, because the style shown in the code has been applied to the <pre> block that renders the example. (That is, if you see the generated text before the actual code in the <pre> block above, try to select it. Chances are you will not be able to select the text rendered by CSS, which is a general characteristic of CSS generated content.)

Still More CSS Generated Content:

If the topic of CSS generated content interests you, along with how to properly encode characters as part of a CSS content string (or otherwise), then I recommend you begin at FileFormat.info, a web site with perhaps the most extensive collection of Unicode and charset information available for free. FileFormat.info represents, in my opinion, an incredible accomplishment in online reference material. I recommend you check it out and bookmark it while you’re there. You won’t find a lot about CSS, but you will find just about everything you might need to know about how to express text and character information in a variety of formats: from how to generate a character in regular text (keyed in from the standard keyboard, in some instances using a combination of keys in succession, like a special-move combo in video games) to binary, decimal, hexadecimal, Java source syntax, and more. Dig deep; there’s a lot to be found there.

What about Character Encodings and Charset?

The topic of character encodings and character sets is far more complex than could possibly be explained in a closing paragraph such as this one. To understand it, and to work in different encodings and charsets, I recommend you have the best tools available for the task. Though not specific to CSS, I believe a well-equipped developer should have in his or her collection of software either BabelPad or BabelMap, or both: the extensive Unicode-aware applications by BabelStone. (See BabelStone Unicode in the WordPress Center .net pages, under Windows Software.)

BabelStone is maintained by Andrew West PhD (Princeton University). Alongside his extensive research, his authoritative knowledge of Chinese glyphs, and the books he has authored, Dr. West has also contributed to the Unicode Standard itself. When it comes to character encodings (e.g. numeric character references [NCRs], HTML entities, Unicode), if you want to do it right, you can’t go wrong with BabelStone!

Summary

Review the references provided here, and seek out more knowledge on this subject. There’s an extensive amount of credible information out there, so go grab it, but in doing so, take notes. Summarize and share. 😉
