Jay Harris is Cpt. LoadTest

a .net developers blog on improving user experience of humans and coders
 
Filed under: Blogging | dasControls | JavaScript | SEO

If you have read my post on Misconceptions on JavaScript Plugins and SEO, you know that search engines don't do JavaScript. Though these plugins and libraries (such as one for pulling your latest Twitter updates) are nice for adding dynamic content for your users, they are just end-user flair and add nothing to your SEO rankings. They also put an unnecessary tax on your users, as each client browser is responsible for independently retrieving the external content; the time for your page to render is extended by a few seconds as the client must first download the JS library and then make the JSON/AJAX request for your content.

In response to this, I have created dasControls, a library of custom macros for dasBlog (the blogging engine that powers www.cptloadtest.com). I have started with content that is driven by custom JavaScript libraries, converting the content and data retrieval into server-side controls. For now, dasControls contains only a Twitter Status macro, but I intend to add more controls in the coming months.

dasControls [Build 1.0.0.0] : Download | Project Page

dasControls TwitterStatus Macro

The TwitterStatus macro uses server-side retrieval of your Twitter data, eliminating all client-side JavaScript calls for your tweets. By placing the Twitter request on the server, the data is also available to any search engines that index your page. Additionally, data is cached on the server, and new updates are retrieved based on the polling interval you specify. When using real-time client-side JavaScript calls, there is a 2-5 second delay for your end-users while the data is retrieved from Twitter; by caching the data on the local server, this delay is eliminated, and the content for each user is delivered from the local cache, lightening the load for the end-user while avoiding an undue burden for high-traffic sites.
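
To make the polling interval concrete, here is a minimal sketch of interval-based caching. It is written in JavaScript purely for illustration and is not the dasControls source; `createPollingCache`, `fetchFn`, and the injectable `now` clock are all hypothetical names.

```javascript
// Hypothetical sketch of interval-based caching; not the dasControls source.
function createPollingCache(fetchFn, pollingMinutes, now = Date.now) {
  const intervalMs = pollingMinutes * 60 * 1000;
  let cached = null;     // last value returned by fetchFn
  let fetchedAt = null;  // timestamp (ms) of the last retrieval

  return function get() {
    const t = now();
    // Refresh only on the first call or once the interval has elapsed.
    if (fetchedAt === null || t - fetchedAt >= intervalMs) {
      cached = fetchFn();
      fetchedAt = t;
    }
    return cached;
  };
}

// Usage: the fake "Twitter" fetch below only counts how often it is called.
let calls = 0;
let clock = 0;
const getTweets = createPollingCache(() => ['tweet #' + (++calls)], 5, () => clock);

getTweets();             // first request: fetches
clock += 2 * 60 * 1000;  // two minutes later
getTweets();             // served from the cache
clock += 3 * 60 * 1000;  // five minutes after the first fetch
getTweets();             // interval elapsed: fetches again
```

However many visitors arrive inside one interval, only a single outbound request is made, which is why the cache helps high-traffic sites as much as it helps the individual reader.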

Macro Name: TwitterStatus
Macro Syntax: <% TwitterStatus("user name"[, number of tweets[, polling interval]])|dasControls %>

  • User Name : String. Your Twitter handle.
  • Number of Tweets : Integer. The number of tweets to retrieve and display. [default: 10]
  • Polling Interval : Integer. The number of minutes between each Twitter retrieval. [default: 5]

Relevant CSS:

  • TwitterStatusItem : CSS class given to each Tweet, rendered as a DIV.
  • TwitterStatusTimestamp : CSS class given to each Tweet's timestamp ("32 minutes ago"), rendered as an inline SPAN within each Tweet element.
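
As a rough picture of the markup these two classes hang on, the sketch below renders a tweet as a DIV with an inline SPAN for the timestamp. The exact HTML emitted by the macro is an assumption here, and `renderTweets`, `relativeTime`, and `escapeHtml` are hypothetical helpers written only for this illustration.

```javascript
// Escape text so tweet content cannot inject markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn an age in minutes into a label such as "32 minutes ago".
function relativeTime(minutes) {
  if (minutes < 1) return 'just now';
  if (minutes < 60) return minutes + (minutes === 1 ? ' minute ago' : ' minutes ago');
  const hours = Math.floor(minutes / 60);
  return hours + (hours === 1 ? ' hour ago' : ' hours ago');
}

// Assumed markup: one DIV per tweet, with the timestamp in an inline SPAN.
function renderTweets(tweets) {
  return tweets.map(function (tweet) {
    return '<div class="TwitterStatusItem">' + escapeHtml(tweet.text) +
      ' <span class="TwitterStatusTimestamp">' + relativeTime(tweet.ageMinutes) +
      '</span></div>';
  }).join('\n');
}

const html = renderTweets([{ text: 'Released dasControls <1.0>', ageMinutes: 32 }]);
console.log(html);
```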

Using the Macro within a dasBlog Template

This macro is for use in the dasBlog HomeTemplate. The macro works just like any out-of-the-box macro, except that you must also include the alias specified within the dasControls entry in the web.config (the value of the "macro" attribute). Your Twitter handle is required, though you can also optionally include the number of Tweets to pull from Twitter (default: 10) and the number of minutes between each Twitter data request (default: 5). Because everything happens on the server, there is no need to include any of the Twitter JSON JavaScript libraries or HTML markup.

<% TwitterStatus("jayharris", 6, 5)|dasControls %>

Installation and Setup of dasControls

Download dasControls and extract the assembly into your dasBlog 'bin' directory.

Enable Custom Macros within your dasBlog installation, and add the Twitter macro to your list of Custom Macros.
First, ensure that the <newtelligence.DasBlog.Macros> section exists within your web.config:

<newtelligence.DasBlog.Macros>
  <!-- Other Macro Libraries -->
</newtelligence.DasBlog.Macros>

Second, ensure that the Macros Configuration Section is defined in your web.config <configSections>:

<configSections>
  <!-- Other Configuration Sections -->
  <section requirePermission="false" name="newtelligence.DasBlog.Macros"
    type="newtelligence.DasBlog.Web.Core.MacroSectionHandler,
      newtelligence.DasBlog.Web.Core" />
</configSections>

Third, add the dasControls library entry to the dasBlog Macros section:

<newtelligence.DasBlog.Macros>
  <add macro="dasControls"
    type="HarrisDesigns.Controls.dasBlogControls.Macros,
      HarrisDesigns.Controls.dasBlogControls"/>
</newtelligence.DasBlog.Macros>

Roadmap for dasControls

In the upcoming weeks and months, I plan on adding additional macros to the dasControls library, including Delicious, Google Reader's Shared Items, and Facebook. If you're interested in any others, or have any ideas, please let me know.

Wednesday, 30 September 2009 22:33:55 (Eastern Daylight Time, UTC-04:00)

Filed under: Blogging | JavaScript | SEO

Search Engine Optimization is high on the radar, right now. Whether it be the quest for the first Coupon site in Bing, the highest Cosmetics site on Google, or the top-ranked "Jay Harris" on every search engine, the war is waged daily throughout the internet. For companies, it's the next sale. For people, it's the next job. Dollars are on the line in a never-ending battle for supremacy.

One of the contributing factors in your Search Engine Ranking is Content. Fresh, new content brings more search engine crawls. More crawls contribute to higher rankings. Search engines like sites that are constantly providing new content; it lets the engine know that the site is not dead or abandoned. And though this new content idea works out well for the New York Times and CNN, not everyone has a team of staff writers who are paid to constantly produce new content. So we shortcut. We don't so much have to have new content as long as we make Google think we have new content. There are hundreds if not thousands of JavaScript plugins out there to provide fresh content to our readers, ranging from Picasa photos, to Twitter updates, to AdWords, to Microsoft Gamercard tags. But I have to let you in on a little secret:

JavaScript Plugins do nothing for SEO.
Nothing.
Search engine spiders don't do JavaScript.

"This must be a lie. When I look at my site, I see my new photos, or my new tweets, or my new Achievement Points; why don't the spiders see it, too?" Well, it's true. Google Spiders, and most other Search Engine Spiders, don't do JavaScript, which is why JS provides no SEO contribution; spiders do not index what they do not see. A look through your traffic monitor, like Google Analytics, will often show a disparity between logged traffic and what is actually accounted for in Web Server logs. Analytics, a JavaScript-based traffic monitor, only logs about 40% of the total traffic to this site (excluding traffic to the RSS feed), which means that the other 60% of my visitors are not running JavaScript. Having JavaScript disabled on 60% of all browsers seems like a ridiculously high percentage, unless you consider that Spiders and Bots do not execute JavaScript.

Just like Google doesn't see the pretty layout from your stylesheet, Google also doesn't see the dynamic content from your JavaScript. Pulling down HTML (since it is all just text, anyway) is easy; there's not even a lot of overhead associated with parsing that HTML. But add in some JavaScript, and suddenly there's a lot more effort involved in crawling your page, especially since there is a lot of bad JavaScript out there. So search engines just check what has been written into your HTML. They read the URL, the keywords and META description, but only the content as rendered by the server. JavaScript is not touched, and JavaScript-based content is not indexed.
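
To see the difference for yourself, the sketch below approximates what a spider that never runs JavaScript would read from a page. It is a deliberately crude illustration and not how any real crawler is implemented; `textSeenByCrawler` and the sample page are made up for the example.

```javascript
// Rough approximation of a non-JavaScript crawler's view of a page:
// scripts are dropped without being run, tags are removed, text is kept.
function textSeenByCrawler(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '') // the script never executes
    .replace(/<[^>]+>/g, ' ')                            // strip the remaining tags
    .replace(/\s+/g, ' ')
    .trim();
}

const page =
  '<h1>My Blog</h1>' +
  '<div id="tweets"></div>' +
  '<script>document.getElementById("tweets").innerHTML = "Latest tweet";</script>' +
  '<p>Server-rendered post text</p>';

console.log(textSeenByCrawler(page));
// My Blog Server-rendered post text
```

The "Latest tweet" text only exists after a browser runs the script, so it never reaches the index, while the server-rendered paragraph does.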

So how do you get around this? How do you get this SEO boost, since JavaScript isn't an available option?

Use plug-ins and utilities that pull your dynamic data server-side, rather than client-side. Create a custom WebControl that will download and parse your latest Twitter updates. Create a dasBlog macro to create your Microsoft Gamertag. By putting this responsibility on the server, not only will you make life easier on your end user (one less JavaScript library to download), but you also make this new content available to indexing engines, which can only help your Google Juice.

Update:

I've been working on a set of macros for dasBlog to start pulling my dynamic content retrievals to the server. Keep an eye out over the next couple of days for the release of my first macro, a Twitter Status dasBlog macro that will replace the need for the Twitter JS libraries on your site.

Monday, 31 August 2009 08:47:29 (Eastern Daylight Time, UTC-04:00)

Filed under: Blogging | SEO

You may have heard of Robots.txt. Or, you may have seen requests for /Robots.txt in your web traffic logs, and if the file doesn't exist, a related HTTP 404. But what is this Robot file, and what does it do?

Introduction to Robots.txt

When on a web server, Robots.txt is a file that directs Robots (a.k.a. Spiders or Web Crawlers) on which files and directories to ignore when indexing a site. The file is located in the root directory of the domain, and is typically used to hide areas of a site from search engine indexing, such as to keep a page off of Google's radar (such as my DasBlog login page) or if a page or image is not relevant to the traditional content of a site (maybe a mockup page for a CSS demo contains content about puppies, and you don't want to mislead your potential audience). Robots request this file prior to indexing your site, and its absence indicates that the robot is free to index the entire domain. Also, note that each sub-domain uses a unique Robots.txt. When a spider is indexing msdn.microsoft.com, it won't look for the file on www.microsoft.com; MSDN will need its own copy of Robots.txt.

How do I make a Robots.txt?

Robots.txt is a simple text file. You can create it in Notepad, Word, Emacs, DOS Edit, or your favorite text editor. Also, the file belongs in the root of the domain on your web server.

Allow all robots to access everything:

The most basic file will be to authorize all robots to index the entire site. The asterisk [*] for User Agent indicates that the rule applies to all robots, and by leaving the value of Disallow blank rather than including a path, it effectively disallows nothing and allows everything.

# Allow all robots to access everything
User-agent: *
Disallow:

Block all robots from accessing anything:

Conversely, with only one more character, we can invert the entire file and block everything. By setting Disallow to a root slash, every file and directory stemming from the root (in other words, the entire site) will be blocked from robot indexing.

# Block all robots from accessing anything
User-agent: *
Disallow: /

Allow all robots to index everything except scripts, logs, images, and that CSS demo on Puppies:

Disallow is a partial-match string that is compared against the beginning of the URL path; setting Disallow to "/image" would match both /images/ and /imageHtmlTagDemo.html. Disallow can also be included multiple times with different values to disallow a robot from multiple files and directories.

# Block all robots from accessing scripts, logs,
#    images, and that CSS demo on Puppies
User-agent: *
Disallow: /images/
Disallow: /logs/
Disallow: /scripts/
Disallow: /demos/cssDemo/puppies.html
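
The matching itself is simple enough to sketch. The snippet below assumes each Disallow value is compared against the start of the URL path and that an empty value blocks nothing; `isDisallowed` is a hypothetical helper for illustration, not part of any crawler.

```javascript
// Returns true when the URL path starts with any non-empty Disallow value.
function isDisallowed(path, disallowRules) {
  return disallowRules.some(function (rule) {
    return rule !== '' && path.startsWith(rule);
  });
}

const rules = ['/images/', '/logs/', '/scripts/', '/demos/cssDemo/puppies.html'];

isDisallowed('/images/logo.png', rules);             // true
isDisallowed('/demos/cssDemo/kittens.html', rules);  // false
isDisallowed('/imageHtmlTagDemo.html', ['/image']);  // true: prefix match
isDisallowed('/anything', ['']);                     // false: empty Disallow blocks nothing
```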

Block all robots from accessing anything, except Google, which is only blocked from images:

Just as a browser has a user agent, so does a robot. For example, "Googlebot/2.1 (http://www.google.com/bot.html)" is one of the user agents for Google's indexer. Like Disallow, the User-agent value in Robots.txt is a partial-match string, so simply setting the value to "Googlebot" is sufficient for a match. Also, the User-agent and Disallow entries cascade: the most specific User-agent setting that matches a robot is the one that is recognized.

# Block all robots from accessing anything,
#    except Google, which is only blocked from images
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow: /images/
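
One way to picture the cascade is to group the Disallow lines under their User-agent and then pick the most specific group for a given robot. The sketch below does that under simplifying assumptions (only User-agent and Disallow are understood, and "most specific" means the longest matching name); `parseRobots` and `rulesFor` are hypothetical helpers, and real robots may resolve ties differently.

```javascript
// Parse robots.txt text into groups of { agents: [...], disallow: [...] }.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  text.split(/\r?\n/).forEach(function (raw) {
    const line = raw.replace(/#.*$/, '').trim();  // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) return;                       // blank or malformed line
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group; otherwise start a new one.
      if (!lastWasAgent) {
        current = { agents: [], disallow: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else {
      if (field === 'disallow' && current) current.disallow.push(value);
      lastWasAgent = false;
    }
  });
  return groups;
}

// Pick the group whose User-agent name is the longest one found inside the
// robot's user agent string; fall back to the "*" group.
function rulesFor(userAgent, groups) {
  const ua = userAgent.toLowerCase();
  let best = null;
  let bestLength = -1;
  groups.forEach(function (group) {
    group.agents.forEach(function (agent) {
      let length = -1;
      if (agent === '*') length = 0;
      else if (ua.indexOf(agent) !== -1) length = agent.length;
      if (length > bestLength) {
        best = group;
        bestLength = length;
      }
    });
  });
  return best ? best.disallow : [];
}

const robotsTxt = [
  '# Block all robots from accessing anything,',
  '#    except Google, which is only blocked from images',
  'User-agent: *',
  'Disallow: /',
  'User-agent: Googlebot',
  'Disallow: /images/'
].join('\n');

const groups = parseRobots(robotsTxt);
rulesFor('Googlebot/2.1 (http://www.google.com/bot.html)', groups);  // ['/images/']
rulesFor('SomeOtherBot/1.0', groups);                                // ['/']
```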

Shortcomings of Robots.txt

Similar to the Code of the Order of the Brethren, Robots.txt "is more what you'd call 'guidelines' than actual rules." Robots.txt is not a standardized protocol, nor is it a requirement. Only the "honorable" robots such as the Google or Yahoo search spiders adhere to the file's instructions; other less-honorable bots, such as a spam spider searching for email addresses, largely ignore the file.

Also, do not use the file for access control. Robots.txt is just a suggestion for search indexing, and will by no means block requests to a disallowed directory or file. These disallowed URLs are still freely available to anyone on the web. Additionally, the contents of this file can be used against you, as the items you place in it may indicate areas of the site that are intended to be secret or private; this information could be used to prioritize candidates for a malicious attack, with disallowed pages being the first places to target.

Finally, this file must be located in the root of the domain: www.mydomain.com/robots.txt. If your site is in a sub-folder from the domain, such as www.mydomain.com/~username/, the file must still be on the root of the domain, and you may need to speak with your webmaster to get your modifications added to the file.
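
Put another way, the location of the file depends only on the host, never on the folder a page lives in. A small sketch using the standard `URL` class (`robotsUrlFor` is a hypothetical helper name):

```javascript
// The robots.txt that governs a page lives at the root of that page's host.
function robotsUrlFor(pageUrl) {
  return new URL('/robots.txt', pageUrl).href;
}

robotsUrlFor('http://www.mydomain.com/~username/index.html');
// 'http://www.mydomain.com/robots.txt'
robotsUrlFor('http://msdn.microsoft.com/en-us/library/');
// 'http://msdn.microsoft.com/robots.txt'
```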

Friday, 15 May 2009 09:31:37 (Eastern Daylight Time, UTC-04:00)

Filed under: Blogging

FeedBurner used to allow adding DotNetKicks FeedFlare to your feeds. Even today, the FeedFlare catalog lists "Kick It" using DNK's FeedFlareUnit file. Unfortunately, when adding this file to FeedFlare using the link given in the catalog, the unfortunate user receives only a JavaScript alert of "We could not find a valid FeedFlare file at that location" instead of an enhanced feed. Why? No XML Declaration is contained within DNK's XML file.

Example XML Declaration

<?xml version="1.0" encoding="utf-8"?>

By adding only an XML Declaration to the top of the file, FeedBurner is now able to properly parse the document, and add the new flare to a feed. This would also apply to any custom FeedFlareUnit that you develop; be sure to add an XML Declaration to your XML.
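
If you maintain your own FeedFlareUnit files, a quick check like the one below can catch a missing declaration before you submit the file. This is only a sketch of the check, not FeedBurner's own validation; `hasXmlDeclaration` and `withXmlDeclaration` are hypothetical helpers, and the utf-8 encoding is assumed from the example declaration.

```javascript
// The declaration must be the very first thing in the document.
function hasXmlDeclaration(xml) {
  return /^<\?xml\s+version\s*=\s*(["'])1\.\d+\1[^?]*\?>/.test(xml);
}

// Prepend a default declaration only when one is missing.
function withXmlDeclaration(xml) {
  return hasXmlDeclaration(xml)
    ? xml
    : '<?xml version="1.0" encoding="utf-8"?>\n' + xml;
}

const broken = '<FeedFlareUnit></FeedFlareUnit>';
const fixed = withXmlDeclaration(broken);

hasXmlDeclaration(broken);  // false
hasXmlDeclaration(fixed);   // true
```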

Corrected File for DotNetKicks

Download: kickitflare.xml

<?xml version="1.0" encoding="utf-8"?>
<FeedFlareUnit>
  <Catalog>
    <Title>Kick it</Title>
    <Description>Kick this story on dotnetkicks.com</Description>
  </Catalog>
  <FeedFlare>
    <Text>Kick it</Text>
    <Link href="http://www.dotnetkicks.com/kick/?url=${link}" />
  </FeedFlare>
</FeedFlareUnit>

I have logged a ticket with DotNetKicks on their Google Code issue tracker. Hopefully, as opportunity allows, they can update the file on dotnetkicks.com so that the FeedFlare catalog entry will work once again. Feel free to add DotNetKicks FeedFlare to your FeedBurner feed using the file link above until they have an opportunity to address the ticket.

Adding DotNetKicks FeedFlare to your FeedBurner Feed

To add the "Kick It" FeedFlare to your existing FeedBurner feed:

  1. Copy the URL for the updated kickitflare.xml file to your clipboard:
    http://www.cptloadtest.com/content/text/kickitflare.xml
  2. Log in to FeedBurner at http://feedburner.google.com
  3. Navigate to your feed details, then to the Optimize tab, then to FeedFlare
  4. Paste the URL into the textbox, and click Add New Flare.
  5. As desired, check the appropriate checkboxes to add the flare to your RSS feed and to your site.
  6. Click the Save button at the bottom of the page.
Thursday, 02 April 2009 08:07:41 (Eastern Standard Time, UTC-05:00)

Filed under: Blogging | Flash | Programming

My earlier post on creating custom brushes in Google Syntax Highlighter (Extending Language Support in Google Syntax Highlighter) contains a rudimentary brush for ActionScript. The original is designed for Stone Soup; it is something to get an AS brush established, but is not meant to be exhaustive. I have revisited the brush and added some meat. The brush should now supply a more thorough coverage of the language. A download is provided below.

ActionScript Brush

dp.sh.Brushes.ActionScript = function() {
  var keywords = 'and arguments asfunction break call case catch clear ' +
    'continue default do else escape eval false finally for getProperty ' +
    'if ifFrameLoaded in instanceof loop NaN new newline not null or ' +
    'prototype return set super switch targetPath tellTarget this throw ' +
    'trace true try typeof undefined unescape var visible void while with';
  var builtin = '_currentframe _droptarget _framesloaded _global _height ' +
    '_level _name _root _rotation _target _totalframes _url _visible ' +
    '_width _x _xmouse _xscale _y _ymouse _yscale Array Boolean Button ' +
    'bytesLoaded bytesTotal Camera Color Date enabled Error focusEnabled ' +
    'Key LoadVars Math Mouse MovieClip nextFrame Number Object Selection ' +
    'Sound Stage String StyleSheet System TextFormat';
  var funcs = 'addProperty attachMovie attachVideo browse cancel ' +
    'clearInterval clone concat createEmptyMovieClip createTextField ' +
    'dispose draw duplicateMovieClip dynamic equals extends function ' +
    'getInstanceAtDepth gotoAndPlay gotoAndStop identity implements ' +
    'import interface isEmpty isFinite isNAN join length loadClip ' +
    'loadMovie loadMovieNum loadVariables loadVariablesNum merge moveTo ' +
    'on onClipEvent onDragOut onDragOver onEnterFrame onKeyDown onKeyUp ' +
    'onKillFocus onMouseDown onMouseMove onMouseUp onPress onRelease ' +
    'onReleaseOutside onRollOut onRollOver onUnload play pop prevFrame ' +
    'private public push registerClass removeMovieClip reverse rotate ' +
    'scale setEmpty setInterval setProperty shift slice sort sortOn ' +
    'splice startDrag static stopAllSounds stopDrag subtract swapDepths ' +
    'toString toString translate union unloadClip unloadMovie ' +
    'unloadMovieNum unshiftclass unwatch valueOf watch';
  var includes = '#include #initClip #endInitClip';

  this.regexList = [
    {regex: dp.sh.RegexLib.SingleLineCComments, css: 'comment' },
    {regex: dp.sh.RegexLib.MultiLineCComments, css: 'comment' },
    {regex: dp.sh.RegexLib.DoubleQuotedString, css: 'string' },
    {regex: dp.sh.RegexLib.SingleQuotedString, css: 'string' },
    {regex: new RegExp(this.GetKeywords(keywords), 'gm'), css: 'keyword' },
    {regex: new RegExp(this.GetKeywords(funcs), 'gm'), css: 'func' },
    {regex: new RegExp(this.GetKeywords(builtin), 'gm'), css: 'builtin' },
    {regex: new RegExp(this.GetKeywords(includes), 'gm'), css: 'preprocessor'}
  ];
  this.CssClass = 'dp-as';
  this.Style = '.dp-as .func { color: #000099; }' +
               '.dp-as .builtin { color: #990000; }';
}

dp.sh.Brushes.ActionScript.prototype = new dp.sh.Highlighter();
dp.sh.Brushes.ActionScript.Aliases = ['actionscript', 'as'];

Usage

Upload the Brush JavaScript file to your Google Syntax Highlighter Scripts directory, and load the file into your HTML with a <SCRIPT> tag alongside your other brushes.

<script language="javascript"
  src="dp.SyntaxHighlighter/Scripts/shBrushAs.js"></script>

Display syntax-highlighted ActionScript using a traditional Google Syntax Highlighter <PRE> tag, using as or actionscript as the language alias.

<pre name="code" class="as">
  // Some ActionScript Code
</pre>

Brush In Action

/*
Sample ActionScript for Demo
ActionScript Brush for Google Syntax Highlighter
*/
if (dteDate.getMonth() == intCurrMonth && intCurrMonth == intOldMonth
    && intOldYear == intCurrYear) {
  if (dteDate.getDay() == 0 and dteDate.getDate()>1) {
    intYPosition = intYPosition+20;
  }
  duplicateMovieClip ("DayContainer", "DayContainer"+intDate, intDate);
  setProperty ("DayContainer"+intDate, _y, intYPosition);
  setProperty ("DayContainer"+intDate, _x, intXPosition[dteDate.getDay()]);

  } else if (intCurrMonth == 6) {
    if (intDate == 4) {
      clrFColor = new Color("DayContainer"+intDate+".foreground");
      clrFColor.setRGB(0xFF0000);
      clrBColor = new Color("DayContainer"+intDate+".background");
      clrBColor.setRGB(0xFF0000);
    }
  } else if (intCurrMonth == 9) {
    if (intDate == 31) {
      clrFColor = new Color("DayContainer"+intDate+".foreground");
      clrFColor.setRGB(0xFF9922);
      clrBColor = new Color("DayContainer"+intDate+".background");
      clrBColor.setRGB(0xFF9922);
    }
  } else if (intCurrMonth == 10) {
    if (intDate >= 22 && intDate <= 28 && dteDate.getDay() == 4) {
      clrFColor = new Color("DayContainer"+intDate+".foreground");
      clrFColor.setRGB(0xFFCC00);
      clrBColor = new Color("DayContainer"+intDate+".background");
      clrBColor.setRGB(0xFFCC00);
    }
  set ("DayContainer"+intDate+":MyDate", new Date(dteDate.getFullYear(),
    dteDate.getMonth(), dteDate.getDate()));
  setProperty ("DayContainer"+intDate, _visible, true);
  intDate++;
  dteDate.setDate(intDate);
}

Download

Download: shBrushAs.zip
Includes:

  • Compressed shBrushAs.js for production. 
  • Uncompressed shBrushAs.js for debugging.

As always, this code is provided with no warranties or guarantees. Use at your own risk. Your mileage may vary.

 
Friday, 12 December 2008 08:25:38 (Eastern Standard Time, UTC-05:00)

Filed under: Blogging | Flash | JavaScript | Programming | Tools

As I discussed in an earlier post (Blog your code using Google Syntax Highlighter), Google Syntax Highlighter is a simple tool that allows bloggers to easily display code in a format that is familiar to end users. The tool renders the code in a very consumable fashion that includes colored syntax highlighting, line highlighting, and line numbers. Out of the box it supports most of the common languages of today, and a few from yesterday, but some common languages are unsupported. Perl, ColdFusion, and Flash's ActionScript all are unloved by Google Syntax Highlighter, as are many others that you may want to post to your blog. For these languages, the solution is a custom brush.

Syntax Highlighting Brushes

For Google Syntax Highlighter, brushes are JavaScript files that govern the syntax highlighting process, with names following the format of shBrushLanguage.js, such as shBrushXml.js. Brushes contain information about the keywords, functions, and operators of a language, as well as the syntax for comments, strings, and other syntax characteristics. Keyword-level syntax is applied to any specific word in the language, including keywords, functions, and any word operators, such as and, or, and not. Regular expressions apply character-level syntax to code, and identify items such as character operators, the remainder of an inline comment, or the entire contents of a comment block. Finally, aliases are defined for the brush; these are the language aliases that are used within the class attribute of the Google Syntax Highlighter <PRE> tag. With this information, the brush applies the syntax highlighting styles according to the CSS defined for each component of the language.

Breaking Down Brushes

Decomposing the SQL Brush

In JavaScript, everything is an object that can be assigned to a variable, whether it's a number, string, function, or class. Brushes are each a delegate function. The variable name of the brush must match dp.sh.Brushes.SomeLanguage.

dp.sh.Brushes.Sql = function() {

Next, define the list of keywords for applying syntax highlighting. Each list is not an array, but rather a single-space delimited string of keywords that will be highlighted. Also, multiple keyword lists can exist, such as one list for function names, another for keywords, and perhaps another for types, and unique styling can be applied to each grouping (we'll get to styling a little later).

  var funcs = 'abs avg case cast coalesce convert count current_timestamp ' +
    'current_user day isnull left lower month nullif replace right ' +
    'session_user space substring sum system_user upper user year';

  var keywords = 'absolute action add after alter as asc at authorization ' +
    'begin bigint binary bit by cascade char character check checkpoint ' +
    'close collate column commit committed connect connection constraint ' +
    'contains continue create cube current current_date current_time ' +
    'cursor database date deallocate dec decimal declare default delete ' +
    'desc distinct double drop dynamic else end end-exec escape except ' +
    'exec execute false fetch first float for force foreign forward free ' +
    'from full function global goto grant group grouping having hour ' +
    'ignore index inner insensitive insert instead int integer intersect ' +
    'into is isolation key last level load local max min minute modify ' +
    'move name national nchar next no numeric of off on only open option ' +
    'order out output partial password precision prepare primary prior ' +
    'privileges procedure public read real references relative repeatable ' +
    'restrict return returns revoke rollback rollup rows rule schema ' +
    'scroll second section select sequence serializable set size smallint ' +
    'static statistics table temp temporary then time timestamp to top ' +
    'transaction translation trigger true truncate uncommitted union ' +
    'unique update values varchar varying view when where with work';

  var operators = 'all and any between cross in join like not null or ' +
    'outer some';

Following the keyword definitions is the Regular Expression pattern and Style definition object list. The list, this.regexList, is an array of pattern/style objects: {regex: regexPattern, css: classString}. The regexPattern is a JavaScript RegExp object, and defines the pattern to match in the source code; this pattern can be created using one of three options within Google Syntax Highlighter.

Predefined Patterns
Within Google Syntax Highlighter, dp.sh.RegexLib contains five predefined regular expression patterns. MultiLineCComments is used for any language that uses C-style multi-line comment blocks: /* my comment block */. SingleLineCComments is used for any language that uses C-style single line or inline comments: // my comment. SingleLinePerlComments applies for Perl-style single line comments: # my comment. DoubleQuotedString identifies any string wrapped in double-quotes and SingleQuotedString identifies strings wrapped in single-quotes. These options are used in place of creating a new instance of the RegExp object.
Keyword Patterns
Google Syntax Highlighter has a GetKeywords(string) function which will build a pattern string based on one of the brush's defined keyword strings. However, this is only the pattern string, and not the RegExp object. Pass this value into the RegExp constructor: new RegExp(this.GetKeywords(keywords), 'gmi')
Custom Pattern Definition
Create a new RegExp object using a custom pattern. For example, use new RegExp('--(.*)$', 'gm') to match all Sql comments, such as --my comment.
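
To show what a keyword pattern string might look like, here is a stand-in builder that wraps each word in word boundaries and joins them as alternatives. `buildKeywordPattern` is a hypothetical function written for this illustration; I am not claiming it matches the library's GetKeywords line for line. It assumes the list is separated by exactly one space, as described above.

```javascript
// Stand-in for a keyword-pattern builder: wraps every word in \b word
// boundaries and joins the alternatives with "|".
function buildKeywordPattern(keywords) {
  return '\\b' + keywords.split(' ').join('\\b|\\b') + '\\b';
}

const operators = 'all and any between cross in join like not null or outer some';
const pattern = buildKeywordPattern(operators);
const regex = new RegExp(pattern, 'gmi');

'SELECT * FROM a CROSS JOIN b WHERE x IS NOT NULL'.match(regex);
// [ 'CROSS', 'JOIN', 'NOT', 'NULL' ]
```

The word boundaries are what keep "or" from matching inside "FROM", and the "i" flag is what lets a lower-case keyword list match upper-case SQL.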

For these pattern/style objects, the regular expression pattern is followed by the name of the CSS class to apply to any regular expression matches. The style sheet packaged with Google Syntax Highlighter, SyntaxHighlighter.css, already defines the many CSS classes used by GSH; place the additional styles for your custom brushes within this file, in a new file, in your HTML, or define them within the brush using JavaScript.

  this.regexList = [
    {regex: new RegExp('--(.*)$', 'gm'), css: 'comment'},
    {regex: dp.sh.RegexLib.DoubleQuotedString, css: 'string'},
    {regex: dp.sh.RegexLib.SingleQuotedString, css: 'string'},
    {regex: new RegExp(this.GetKeywords(funcs), 'gmi'), css: 'func'},
    {regex: new RegExp(this.GetKeywords(operators), 'gmi'), css: 'op'},
    {regex: new RegExp(this.GetKeywords(keywords), 'gmi'), css: 'keyword'}
  ];
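
It can help to see how a list like this might be applied to a piece of code. The sketch below collects the matches from every pattern/style object, sorts them by position, and skips any match that begins inside one already kept, so a keyword inside a comment stays part of the comment. This is an illustration of the idea under my own assumptions, not the highlighter's actual source; `findHighlights` and the shortened list are hypothetical.

```javascript
// Collect every match from every {regex, css} entry, then keep the earliest
// match at each position and skip any match that starts inside a kept one.
function findHighlights(code, regexList) {
  const matches = [];
  regexList.forEach(function (entry) {
    // Copy the pattern (forcing the global flag) so exec() can iterate safely.
    const flags = entry.regex.flags.includes('g') ? entry.regex.flags : entry.regex.flags + 'g';
    const regex = new RegExp(entry.regex.source, flags);
    let m;
    while ((m = regex.exec(code)) !== null) {
      if (m[0].length === 0) { regex.lastIndex++; continue; }  // skip empty matches
      matches.push({ index: m.index, text: m[0], css: entry.css });
    }
  });
  // Sort by start position; on ties, prefer the longer match.
  matches.sort(function (a, b) {
    return a.index - b.index || b.text.length - a.text.length;
  });
  const kept = [];
  let end = 0;
  matches.forEach(function (match) {
    if (match.index >= end) {
      kept.push(match);
      end = match.index + match.text.length;
    }
  });
  return kept;
}

const sampleList = [
  { regex: new RegExp('--(.*)$', 'gm'), css: 'comment' },
  { regex: new RegExp('\\bselect\\b|\\bfrom\\b|\\bwhere\\b', 'gmi'), css: 'keyword' }
];

const highlights = findHighlights('SELECT name FROM t -- select from comment', sampleList);
highlights.map(function (h) { return h.css; });
// [ 'keyword', 'keyword', 'comment' ]
```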

The delegate definition ends with any style specifications. Apply a style sheet to the entire code block using this.CssClass. Also, as mentioned above, the brush can define custom CSS using this.Style as an alternative to placing the CSS in HTML or a CSS file. When finished, close the delegate.

  this.CssClass = 'dp-sql';
  this.Style = '.dp-sql .func { color: #ff1493; }' +
    '.dp-sql .op { color: #808080; }';
}

The final component of a Brush, set outside of your delegate, contains the prototype declaration and any aliases to apply to the Brush. Aliases consist of a string array (a real array this time, not a space-delimited string) of language aliases to use, such as ['c#','c-sharp','csharp']. Alias values must be unique across all defined brushes that you have included into your site.

dp.sh.Brushes.Sql.prototype = new dp.sh.Highlighter();
dp.sh.Brushes.Sql.Aliases = ['sql'];
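
Because aliases must be unique across brushes, a duplicate is worth catching early. The sketch below builds an alias lookup from a set of brush-like objects and throws on a collision; `buildAliasMap` and the sample `brushes` object are hypothetical and only mimic the Aliases property shown above.

```javascript
// Build an alias -> brush-name lookup and reject duplicate aliases.
function buildAliasMap(brushes) {
  const map = new Map();
  Object.keys(brushes).forEach(function (name) {
    (brushes[name].Aliases || []).forEach(function (alias) {
      const key = alias.toLowerCase();
      if (map.has(key)) {
        throw new Error('Alias "' + alias + '" is used by both ' +
          map.get(key) + ' and ' + name);
      }
      map.set(key, name);
    });
  });
  return map;
}

const brushes = {
  Sql: { Aliases: ['sql'] },
  CSharp: { Aliases: ['c#', 'c-sharp', 'csharp'] }
};

const aliases = buildAliasMap(brushes);
aliases.get('c-sharp');  // 'CSharp'
```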

Making a Custom Brush (for ActionScript)

I like rich media applications, such as those developed in Flash or Silverlight. I was surprised when I found that Google Syntax Highlighter does not ship with an ActionScript brush, and more surprised when I found out that no one has written one, yet. So, using the methods from above, I created one. This isn't meant to be an exhaustive brush, but more like Stone Soup. It's a start. Please feel free to add to it.

dp.sh.Brushes.ActionScript = function() {

  var keywords = 'and break case catch class continue default do dynamic else ' +
    'extends false finally for if implements import in interface NaN new not ' +
    'null or private public return static super switch this throw true try ' +
    'undefined var void while with';

  this.regexList = [{regex: dp.sh.RegexLib.SingleLineCComments, css: 'comment'},
    {regex: dp.sh.RegexLib.MultiLineCComments, css: 'comment'},
    {regex: dp.sh.RegexLib.DoubleQuotedString, css: 'string'},
    {regex: dp.sh.RegexLib.SingleQuotedString, css: 'string'},
    {regex: new RegExp(this.GetKeywords(keywords), 'gm'), css: 'keyword'}];

  this.CssClass = 'dp-as';
}

dp.sh.Brushes.ActionScript.prototype = new dp.sh.Highlighter();
dp.sh.Brushes.ActionScript.Aliases = ['actionscript', 'as'];
Wednesday, 10 December 2008 16:47:27 (Eastern Standard Time, UTC-05:00)

Filed under: ASP.Net | Blogging | Programming | SEO

Did you know that yourdomain.com and www.yourdomain.com are actually different sites? Are they both serving the same content? If so, it may be negatively impacting your search engine rankings.

Subdomains and the Synonymous 'WWW'

Sub-domains are the prefix to a domain (http://subdomain.yourdomain.com), and are treated by browsers, computers, domain name systems (DNS), search engines, and the general internet as separate, individual web sites. Google's primary web presence, http://www.google.com, is very different than Google Mail, http://mail.google.com, or Google Documents, http://docs.google.com, all because of subdomains. However, what many do not realize is that www is, itself, a subdomain.

A domain, on its own, requires no www prefix; a subdomain-less http://yourdomain.com should be sufficient for serving up a web site. And since www is a subdomain, dropping the prefix could potentially return a different response. There are some sites that will fail to return without the prefix, and some sites that fail with it, but the most common practice is that the www subdomain is synonymous with no subdomain at all.

The Synonymous WWW and SEO

The issue with having two synonymous URLs (http://yourdomain.com and http://www.yourdomain.com) is that search engines may interpret them as separate sites, even if they are serving the same content. The two addresses are technically independent and are potentially serving unique content; to a cautious search engine, even if pages appear to contain the same content, there may be something different under the covers. This means your audience's search results may return two entries for the same content. Some users will happen to click on yourdomain.com while others navigate to www.yourdomain.com, splitting your traffic, your page hits, and your search ranking between two sites, unnecessarily.

HTTP redirects will cure the issue. If you access http://google.com, your browser is instantly redirected to http://www.google.com. This is done through an HTTP 301 permanent redirect. Search spiders recognize HTTP response codes, and understand the 301 as a "use this other URL instead" command. Many search engines, such as Google, will then update all page entries for the original URL (http://yourdomain.com) and replace them with the 301's destination URL (http://www.yourdomain.com). If there is already an entry for the destination URL, the two entries will be merged together. The search entries for yourdomain.com and www.yourdomain.com will now share traffic, share page hits, and share search ranking. Instead of having two entries on the second and third pages of search results, combining these entries may be just enough to place you on the first page of results.
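
As a rough illustration of what such a redirect amounts to, the sketch below builds the status code and Location header a server would send to fold the bare domain into its www form. It is written in JavaScript purely for illustration; buildWwwRedirect is a hypothetical helper, not part of any library, and it assumes the standard URL class is available (modern browsers, Node.js).

```javascript
// Hypothetical helper: returns the 301 response a server would send to move
// a request for the bare domain onto its www subdomain, or null when the
// request is for some other host and should be served normally.
function buildWwwRedirect(requestUrl, bareDomain) {
  var url = new URL(requestUrl);
  if (url.hostname.toLowerCase() !== bareDomain.toLowerCase()) {
    return null;
  }
  url.hostname = 'www.' + bareDomain;
  return {
    status: 301, // permanent: "use this other URL instead"
    headers: { Location: url.href }
  };
}

var response = buildWwwRedirect(
  'http://yourdomain.com/archive.aspx?page=2', 'yourdomain.com');
```

The path and query string are carried over untouched; only the host changes, which is what lets a search engine map each old entry onto its www counterpart.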

In addition to combining search entries for subdomains, you can also combine root-level domains through HTTP 301. On this site, in addition to adding the www prefix if no subdomain is specified, captainloadtest.com will HTTP 301 redirect to www.cptloadtest.com.

Combining the Synonyms

We need a way to implement an HTTP 301 redirect at the domain level for all requests to a site; however, we are often using applications that may not grant us access to the source, or our host does not give us access to IIS to set up redirects for ourselves. URL Rewrite, Part 2 covers a great drop-in redirect module by Fritz Onion that uses a stand-alone assembly with a few additions in web.config to HTTP 301 redirect paths in your domain (it also supports HTTP 302 redirects). This module is perfect for converting a WordPress blog post URL, such as cptloadtest.com/?p=56, to a DasBlog blog post URL like cptloadtest.com/2006/05/31/VSNetMacroCollapseAll.aspx. However, to redirect domains and subdomains, the module must go a step further and redirect based on matches against the entire URL, such as directing http:// to https:// or captainloadtest.com to cptloadtest.com, which it does not support. It's time for some modifications.

private void OnBeginRequest(object src, EventArgs e) {
  HttpApplication app = src as HttpApplication;
  string reqUrl = app.Request.Url.AbsoluteUri;
  redirections redirs
    = (redirections) ConfigurationManager.GetSection("redirections");

  foreach (Add a in redirs.Adds) {
    Regex regex = new Regex(a.targetUrl, RegexOptions.IgnoreCase);
    if (regex.IsMatch(reqUrl)) {
      string targetUrl = regex.Replace(reqUrl, a.destinationUrl, 1);

      if (a.permanent) {
        app.Response.StatusCode = 301; // make a permanent redirect
        app.Response.AddHeader("Location", targetUrl);
        app.Response.End();
      }
      else
        app.Response.Redirect(targetUrl);

      break;
    }    
  }
}

By converting app.Request.RawUrl to app.Request.Url.AbsoluteUri, the regular expression will now match against the entire URL, rather than just the requested path. There is one downside to this change: the value is the actual path processed, not necessarily what was in the originally requested URL. For example, the value of AbsoluteUri for a request to http://www.cptloadtest.com?p=56 is actually http://www.cptloadtest.com/default.aspx?p=56; by requesting the root directory, the default page is being processed, not the directory itself, so default.aspx is added to the URL. Keep this in mind when setting up your redirection rules. Also, the original code converted the URL to lower case; with my modifications, I chose to maintain the case of the URL, since sometimes case matters, and instead ignore case in the regular expression match using RegexOptions.IgnoreCase. Finally, I made some other minor enhancements, like using the ConfigurationManager, since ConfigurationSettings is now obsolete, and reusing the matching Regex instance for replacements.
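
To see why matching against the full URL matters, here is a small sketch (JavaScript for illustration only; the module itself is C#) that runs one rule's pattern against both the path portion of a request and its full URL. The two string values are hypothetical stand-ins for what the path-only and full-URL properties might hold for a request to the root of captainloadtest.com.

```javascript
// Hypothetical values for a single request. In the module these come from
// the request object; here they are hard-coded for illustration.
var rawUrl = '/?p=56';
var absoluteUri = 'http://captainloadtest.com/default.aspx?p=56';

// The 'i' flag plays the role of RegexOptions.IgnoreCase: 'Default.aspx' in
// the rule still matches 'default.aspx' in the URL, and the rest of the URL
// keeps its original casing.
var rule = /captainloadtest\.com\/Default\.aspx/i;

var matchesPath = rule.test(rawUrl);          // false: the path carries no host
var matchesFullUrl = rule.test(absoluteUri);  // true: the host is in the input

// A non-global regex replaces only the first match, like the count of 1
// passed to Regex.Replace in the module.
var redirected = absoluteUri.replace(rule, 'cptloadtest.com/');
```

Here redirected ends up as http://cptloadtest.com/?p=56: the domain is swapped, the default.aspx that was added to the URL is dropped again, and the query string survives.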

Download: RedirectModule.zip

Includes:

  • Source code for the drop-in Redirect Module
  • Sample web.config that uses the module
  • Compiled version of redirectmodule.dll

The code is based on the original Redirect Module by Fritz Onion and the Xml Serializer Section Handler by Craig Andera. As always, this code is provided with no warranties or guarantees. Use at your own risk. Your mileage may vary. Thanks to Fritz Onion for the original work, and for allowing me to extend his code further.

The usage is the same as Fritz Onion's original module. Drop the assembly into your site's bin, and place a few lines into the web.config. The example below contains the rules as they would apply to this site, 301 redirecting http://www.captainloadtest.com to http://www.cptloadtest.com, and adding the www subdomain to any domain requests that have no subdomain.

<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="redirections"
      type="Pluralsight.Website.XmlSerializerSectionHandler, redirectmodule" />
  </configSections>
  <!-- Redirect Rules -->
  <redirections type="Pluralsight.Website.redirections, redirectmodule">
    <!-- Domain Redirects //-->
    <add targetUrl="captainloadtest\.com/Default\.aspx"
      destinationUrl="cptloadtest.com/" permanent="true" />
    <add targetUrl="captainloadtest\.com"
      destinationUrl="cptloadtest.com" permanent="true" />

    <!-- Add 'WWW' to the domain request //-->
    <add targetUrl="://(cptloadtest)\.com/Default\.aspx"
      destinationUrl="://www.$1.com/" permanent="true" />
    <add targetUrl="://(cptloadtest)\.com"
      destinationUrl="://www.$1.com" permanent="true" />

    <!-- ...More Redirects -->
  </redirections>
  <system.web>
    <httpModules>
      <add name="RedirectModule"
        type="Pluralsight.Website.RedirectModule, redirectmodule" />
    </httpModules>
  </system.web>
</configuration>
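
The way these rules interact can be sketched outside of ASP.NET. The JavaScript below is an illustration only, not the module's code: it walks the four rules in order, stops at the first match, and follows a request for the alternate domain through both hops. It assumes the domain name in the www rules is wrapped in a capture group so that $1 has something to refer to, and it does not model IIS adding the default page to a root request.

```javascript
// The four rules from the sample config, in order. Assumption: the www
// rules capture the domain name so that $1 resolves in the destination.
var rules = [
  { targetUrl: 'captainloadtest\\.com/Default\\.aspx', destinationUrl: 'cptloadtest.com/' },
  { targetUrl: 'captainloadtest\\.com', destinationUrl: 'cptloadtest.com' },
  { targetUrl: '://(cptloadtest)\\.com/Default\\.aspx', destinationUrl: '://www.$1.com/' },
  { targetUrl: '://(cptloadtest)\\.com', destinationUrl: '://www.$1.com' }
];

// Returns the redirect target from the first matching rule, or null when
// no rule applies and the request would be served as-is.
function resolveRedirect(requestUrl) {
  for (var i = 0; i < rules.length; i++) {
    var regex = new RegExp(rules[i].targetUrl, 'i');
    if (regex.test(requestUrl)) {
      return requestUrl.replace(regex, rules[i].destinationUrl);
    }
  }
  return null;
}

var firstHop = resolveRedirect('http://captainloadtest.com/Default.aspx');
var secondHop = resolveRedirect(firstHop);
var settled = resolveRedirect(secondHop);
```

The first hop swaps the domain (http://cptloadtest.com/), the second adds the www prefix (http://www.cptloadtest.com/), and the third call returns null because no rule matches the final URL. Rule order matters: the more specific Default.aspx patterns sit ahead of the general ones so that they get the first chance to match.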

The component is easy to use, and can redirect your site traffic to any URL you choose. Neither code changes to the application nor configuration changes to IIS are needed. By using this module to combine synonymous versions of your URLs, such as alternate domains or subdomains, you will improve your page ranking through combining duplicate search result entries. One more step towards your own search engine optimization goals.

URL Rewrite

Thursday, 04 December 2008 16:43:10 (Eastern Standard Time, UTC-05:00)  #    Comments [4] - Trackback

Filed under: Blogging | JavaScript | Programming | Reviews | Tools

Google Syntax Highlighter is a simple tool that allows bloggers to easily display code in a format that is familiar to end users. The tool renders the code in a very consumable fashion that includes colored syntax highlighting, line highlighting, and line numbers.

/*
This is an example of how Google
Syntax Highlighter can highlight and display syntax
to you, the end user
*/
public void HelloWorld()
{
  // I have some comments
  Console.WriteLine("Hello, World!");
}

It is purely a client-side tool, as all of the processing is done strictly within the browser through JavaScript. There is no server-side processing. Since it is all JavaScript, you don't need special Copy/Paste plugins and macros installed to your favorite IDE or your blog authoring tool. (I am leery of random plugins and installing them into the software that I use to feed my family.) To include code in your blog post, copy your code from Visual Studio, Notepad, Flash, Firebug, or any tool that displays text, and paste it into your post. As of v1.5.1, Google Syntax Highlighter supports C, C++, C#, CSS, Delphi, HTML, Java, JavaScript, PHP, Pascal, Python, Ruby, SQL, VB, VB.NET, XML, and XSLT, and all of this is just what comes out of the box.

Setting Up Syntax Highlighter

To get Syntax Highlighter running on your blog, download the latest version of the RAR archive and extract the code. The archive contains a parent folder, dp.SyntaxHighlighter, with three child folders:

dp.SyntaxHighlighter
  \Scripts         //Production-ready (Compressed) scripts
  \Styles          //CSS
  \Uncompressed    //Human-readable (Uncompressed/Debug) scripts

Once the archive is extracted, upload dp.SyntaxHighlighter to your blog. Feel free to rename the folder if you like, though I did not. It is not necessary to upload the Uncompressed folder and its files; they are best used for debugging or for viewing the code, as the files in the Scripts folder have been compressed to reduce bandwidth by having most of their whitespace removed.

After you have uploaded the files, you will need to add script and style references to your site's HTML. This code is not for your posts, but rather for your blog template. In DasBlog, I place this code in the <HEAD> block of my homeTemplate.blogtemplate file. Remember to change the file paths to match the path to where you uploaded the code.

<link type="text/css" rel="stylesheet"
  href="dp.SyntaxHighlighter/Styles/SyntaxHighlighter.css"></link>
<script language="javascript" src="dp.SyntaxHighlighter/Scripts/shCore.js"></script>
<script language="javascript" src="dp.SyntaxHighlighter/Scripts/shBrushCSharp.js"></script>
<script language="javascript" src="dp.SyntaxHighlighter/Scripts/shBrushXml.js"></script>
<script language="javascript">
window.onload = function () {
  dp.SyntaxHighlighter.ClipboardSwf = 'dp.SyntaxHighlighter/Scripts/clipboard.swf';
  dp.SyntaxHighlighter.HighlightAll('code');
}
</script>

To make the tool most efficient, including minimizing the code download by the client browser, highlighting is only enabled for the languages that you specify. The highlighting rules for each language are available through a file referred to as a Brush. The code sample above enables only C# and XML/HTML by including the core file, shCore.js, the C# brush, shBrushCSharp.js, and the XML/HTML brush, shBrushXml.js. A unique brush file is available for each of the supported languages, and only the core file is required. These brushes are located in your Scripts directory (the human-readable version is in the Uncompressed folder). Include only the brushes that you like; if you forget a language brush, the code will still display on your page, but as unformatted text.

<!-- Unformatted HTML Code / No Brush -->
<p id="greeting">Hi, mom & dad!</p>
<!-- Formatted HTML Code -->
<p id="greeting">Hi, mom & dad!</p>

Making Syntax Highlighter Go

Now that the application is deployed to the site, how does it get applied to a post? Paste the code into the HTML view of your post, inside of a <PRE> tag. Create a name attribute on your tag with a value of code, and a class attribute set to the language and options you are using.

<pre name="code" class="c-sharp">
  public void HelloWorld()
  {
    Console.WriteLine("Hello, World!");
  }
</pre>

One catch is that the code must first be made HTML-safe. All angle brackets, <tag>, must be converted to their HTML equivalent, &lt;tag&gt;, and all ampersands, &, to &amp;. I also find it helpful if your code indentation uses two spaces, rather than tabs.

<!-- Pre-converted code -->
<p>Hi, mom & dad!</p>
<!-- Converted code -->
<pre name="code" class="html">
  &lt;p&gt;Hi, mom &amp; dad!&lt;/p&gt;
</pre>
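
If you would rather not do that conversion by hand, a few lines of JavaScript can do it. This is a minimal sketch, not part of Syntax Highlighter; makeHtmlSafe is a hypothetical helper name.

```javascript
// Hypothetical helper: makes a code sample HTML-safe before it is pasted
// into a <pre> block. Ampersands go first so that the entities produced
// for the angle brackets are not escaped a second time.
function makeHtmlSafe(code) {
  return code
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

var safe = makeHtmlSafe('<p>Hi, mom & dad!</p>');
// safe is now: &lt;p&gt;Hi, mom &amp; dad!&lt;/p&gt;
```

The order of the replacements is the one detail to watch: escaping the ampersand last would turn &lt; into &amp;lt; and the browser would display the entity text instead of the bracket.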

The class attribute is made up of both language and option aliases. These aliases consist of one language followed by your desired options, all in a colon-delimited list.

class="language[:option[:option[:option]]]"

The value of language is any of Syntax Highlighter's defined language aliases, such as c#, csharp, or c-sharp for C#, or rb, ruby, rails, or ror for Ruby. See: full list of available languages.

Options allow for such things as turning off the plain text / copy / about controls (nocontrols), turning off the line number gutter (nogutter), or specifying the number of the first line (firstline[n]). A JavaScript code block with no controls header, and starting the line numbering at 34, would have a class attribute value of class="js:nocontrols:firstline[34]". See: full list of available options.
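
To make the shape of that attribute concrete, the sketch below splits a class value into its language and options. It is an illustration of the format only, not the parser Syntax Highlighter itself uses; parseCodeClass is a hypothetical name.

```javascript
// Illustration only: splits a class value of the form
// language[:option[:option]] into its parts.
function parseCodeClass(classValue) {
  var parts = classValue.split(':');
  var result = { language: parts[0], options: {} };
  for (var i = 1; i < parts.length; i++) {
    // Options such as firstline[34] carry a number in square brackets;
    // flags such as nocontrols have no value and are recorded as true.
    var match = /^([a-z]+)(?:\[(\d+)\])?$/i.exec(parts[i]);
    if (match) {
      result.options[match[1]] =
        match[2] === undefined ? true : parseInt(match[2], 10);
    }
  }
  return result;
}

var parsed = parseCodeClass('js:nocontrols:firstline[34]');
// parsed.language is 'js'; parsed.options is { nocontrols: true, firstline: 34 }
```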

Extending Syntax Highlighter

Because Google Syntax Highlighter is entirely in JavaScript, you have access to all of the code. Edit it however you like to suit your needs. Additionally, brushes are very easy to create, and include little more than a list of a highlighted language's keywords in a string and an array of language aliases. Creating a brush for ActionScript or QBasic would not take much time. Language brushes exist in the wild for Perl, DOS Batch, and ColdFusion.
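
To give a feel for how little a brush needs, here is a stand-alone sketch of the idea. It is not the library's actual brush format: the object shape, the QBasic keyword list, the aliases, and the helper names are all hypothetical, and a real brush also has to deal with comments and strings.

```javascript
// Illustration only: a brush reduced to the two pieces described above,
// a keyword string and an array of language aliases.
var qbasicBrush = {
  keywords: 'dim end for goto if next print then',
  aliases: ['qbasic', 'qb']
};

// 'dim end for' becomes /\b(?:dim|end|for)\b/gi
function buildKeywordRegex(keywords) {
  return new RegExp('\\b(?:' + keywords.split(' ').join('|') + ')\\b', 'gi');
}

// Wraps every keyword found in the code in a span that CSS can style.
function markKeywords(code, brush) {
  return code.replace(buildKeywordRegex(brush.keywords), function (word) {
    return '<span class="keyword">' + word + '</span>';
  });
}

var marked = markKeywords('PRINT total', qbasicBrush);
// marked is now: <span class="keyword">PRINT</span> total
```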

In a future post I plan on discussing Brush Creation in depth through creating a brush for ActionScript.

Comparing Syntax Highlighter to Others

I am a fan of this tool, though that should be obvious considering it is what I use on this blog. I like how readable the code is, how extensible it is, and how easy it is to use. I don't like its compatibility, or lack thereof, with RSS; since all of the work is done in JavaScript, and RSS doesn't do JavaScript, there is no syntax highlighting, numbers, or options within a feed, though the code formatting is still maintained. Other tools, like the CopySourceAsHtml plugin for Visual Studio or Insert Code Snippet for Windows Live Writer, convert your code into formatted HTML, where all of the syntax highlighting is applied through HTML attributes and embedded CSS. Their methods are much easier than Syntax Highlighter, since there are no stylesheets or JavaScript files to include in your HTML, and you don't have to worry about making your code HTML-safe. Also, their method works in RSS feeds. However, there isn't the same level of control. Through Syntax Highlighter's extensibility, I can theme my code views, such as if I wanted them to look like my personal Visual Studio theme. Through Syntax Highlighter, I can also make changes at a later time, and those changes will immediately be reflected in all past posts, whereas making modifications to the HTML/embedded CSS pattern is much more difficult.

Final Thoughts

I like CopySourceAsHtml in Visual Studio. I used it for years on this blog. But I code in more languages than VB.Net or C#, and the plugin isn't available within the Flash or LoadRunner IDE. I was also frustrated with pasting my code in, only to find that it was too wide for my blog theme's margins, and would have to go back to Visual Studio, change my line breaks, and repeat the process. I'm sticking with Google Syntax Highlighter. It works for all of my languages (as soon as I finish writing my ActionScript brush), and when my lines are too long, I simply change my HTML. And in my HTML, my code still looks like code, rather than a mess of embedded style. I have to sacrifice RSS formatting, but as a presentation developer who is very particular about his HTML, I am glad for the customization and control.

Monday, 24 November 2008 10:41:50 (Eastern Standard Time, UTC-05:00)  #    Comments [9] - Trackback