Jay Harris is Cpt. LoadTest

a .NET developer's blog on improving the user experience of humans and coders
Filed under: ASP.Net | Events | Speaking

Next month, I will be speaking at CodeStock, a developer conference in Knoxville, Tennessee, held June 26-27. We will be discussing the ASP.NET Page Life Cycle, to help get past the fears and troubles with validation, event handling, data binding, and the conflicts between page load and page initialization.

Dev Basics: The ASP.NET Page Life Cycle

Jay Harris / Session Level: 100
When a request occurs for an ASP.NET page, the response is processed through a series of events before being sent to the client browser. These events, known as the Page Life Cycle, are a complicated headache when used improperly, manifesting as odd exceptions, incorrect data, performance issues, and general confusion. It seems simple when reading yet-another-book-on-ASP.NET, but never when applied in the real world. In this session, we decompose this mess, and turn the Life Cycle into an effective and productive tool. No ASP.NET MVC, no Dynamic Data, no MonoRail, no technologies of tomorrow, just the basics of ASP.NET, using the tools we have available in the office, today.
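For a flavor of what the session covers, here is a minimal code-behind sketch of the events in play. The page and control names are hypothetical, and the comments describe the general ordering of the Life Cycle rather than anything specific to the session material.

// A hypothetical code-behind illustrating the order of the common
// life cycle events; DemoPage and SaveButton are placeholder names.
using System;
using System.Web.UI;

public partial class DemoPage : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        // Init: controls exist, but ViewState and postback data have not
        // been applied yet. Create dynamic controls here so they
        // participate in ViewState and postback events.
    }

    protected void Page_Load(object sender, EventArgs e)
    {
        // Load: ViewState and postback data have been applied.
        if (!IsPostBack)
        {
            // One-time setup, such as the initial data bind.
        }
    }

    protected void SaveButton_Click(object sender, EventArgs e)
    {
        // Control events fire after Load and before PreRender.
        Page.Validate();
        if (!Page.IsValid)
        {
            return; // Validators have already run; stop on invalid input.
        }
        // Handle the postback here.
    }

    protected override void OnPreRender(EventArgs e)
    {
        base.OnPreRender(e);
        // PreRender: the last chance to adjust control state before
        // ViewState is saved and the page renders.
    }
}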

It's a long drive from Michigan to Knoxville, but the conference is worth the trip (the first of two Tennessee conferences I will be attending this year). A few other local speakers will be making the trip to Knoxville, as well. Check out the full session list for more information, and while you are at it, register for the event if you haven't already done so; the cost is only $25 if you sign up before the end of May. I was there last year for the first CodeStock, and I had a great time; I'm excited about this year's event, not only because I am speaking, but to see what new things people are talking about, catch up with friends, and meet new people in the community.

I hope to see you there.

Monday, May 18, 2009 9:27:01 AM (Eastern Daylight Time, UTC-04:00)

Filed under: Blogging | SEO

You may have heard of Robots.txt. Or, you may have seen requests for /Robots.txt in your web traffic logs, and if the file doesn't exist, a related HTTP 404. But what is this Robot file, and what does it do?

Introduction to Robots.txt

Robots.txt is a file on a web server that tells robots (a.k.a. spiders or web crawlers) which files and directories to ignore when indexing a site. The file lives in the root directory of the domain and is typically used to hide areas of a site from search engine indexing, whether to keep a page off of Google's radar (such as my DasBlog login page) or because a page or image is not relevant to the traditional content of a site (maybe a mockup page for a CSS demo contains content about puppies, and you don't want to mislead a potential audience). Robots request this file prior to indexing your site, and its absence indicates that the robot is free to index the entire domain. Also, note that each sub-domain uses its own Robots.txt: when a spider is indexing msdn.microsoft.com, it won't look for the file on www.microsoft.com; MSDN needs its own copy of Robots.txt.

How do I make a Robots.txt?

Robots.txt is a simple text file. You can create it in Notepad, Word, Emacs, DOS Edit, or your favorite text editor. Also, the file belongs in the root of the domain on your web server.

Allow all robots to access everything:

The most basic file authorizes all robots to index the entire site. The asterisk (*) for User-agent indicates that the rule applies to all robots, and leaving the value of Disallow blank rather than including a path effectively disallows nothing and allows everything.

# Allow all robots to access everything
User-agent: *
Disallow:

Block all robots from accessing anything:

Conversely, with only one more character, we can invert the entire file and block everything. Setting Disallow to a root slash blocks every file and directory stemming from the root (in other words, the entire site) from robot indexing.

# Block all robots from accessing anything
User-agent: *
Disallow: /

Allow all robots to index everything except scripts, logs, images, and that CSS demo on Puppies:

Disallow is a partial-match string: setting Disallow to "/image" would match both /images/ and /imageHtmlTagDemo.html. Disallow can also be included multiple times with different values to block a robot from multiple files and directories.

# Block all robots from accessing scripts, logs,
#    images, and that CSS demo on Puppies
User-agent: *
Disallow: /images/
Disallow: /logs/
Disallow: /scripts/
Disallow: /demos/cssDemo/puppies.html
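To make the partial-match rule concrete, here is a rough C# sketch of the kind of prefix test a crawler might perform. It is an illustration of the matching behavior only, not any particular robot's implementation, and the RobotsRules name is made up.

// Illustration only: a simple "starts with" test against the URL path,
// where an empty Disallow value blocks nothing.
using System;
using System.Linq;

static class RobotsRules
{
    public static bool IsAllowed(string path, string[] disallowPrefixes)
    {
        return !disallowPrefixes.Any(prefix =>
            prefix.Length > 0 &&
            path.StartsWith(prefix, StringComparison.Ordinal));
    }
}

// "Disallow: /image" blocks both of these...
//   RobotsRules.IsAllowed("/images/logo.png", new[] { "/image" })       -> false
//   RobotsRules.IsAllowed("/imageHtmlTagDemo.html", new[] { "/image" }) -> false
// ...but not this one:
//   RobotsRules.IsAllowed("/about.html", new[] { "/image" })            -> true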

Block all robots from accessing anything, except Google, which is only blocked from images:

Just as a browser has a user agent, so does a robot. For example, "Googlebot/2.1 (http://www.google.com/bot.html)" is one of the user agents for Google's indexer. Like Disallow, the User-agent value in Robots.txt is a partial-match string, so simply setting the value to "Googlebot" is sufficient for a match. Also, the User-agent and Disallow entries cascade, with the most specific User-agent setting being the one that is recognized.

# Block all robots from accessing anything,
#    except Google, which is only blocked from images
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /images/
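Here is a rough C# sketch of how that "most specific" selection might work, assuming the winner is simply the longest User-agent token found within the robot's full user-agent string, with the asterisk as the fallback. Again, the RobotsGroups name is made up, and this is not any particular crawler's implementation.

// Illustration only: pick the record whose User-agent token is the longest
// match within the robot's full user-agent string, falling back to "*".
using System;
using System.Collections.Generic;
using System.Linq;

static class RobotsGroups
{
    public static string SelectGroup(string robotUserAgent, IEnumerable<string> groupTokens)
    {
        string best = groupTokens
            .Where(token => token != "*" &&
                   robotUserAgent.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0)
            .OrderByDescending(token => token.Length)
            .FirstOrDefault();

        return best ?? "*";
    }
}

// "Googlebot/2.1 (http://www.google.com/bot.html)" resolves to the "Googlebot"
// record, so Google only skips /images/; any other robot falls back to "*"
// and is blocked from the whole site.
//   RobotsGroups.SelectGroup("Googlebot/2.1 (http://www.google.com/bot.html)",
//                            new[] { "*", "Googlebot" })  -> "Googlebot"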

Shortcomings of Robots.txt

Similar to the Code of the Order of the Brethren, Robots.txt "is more what you'd call 'guidelines' than actual rules." Robots.txt is not a standardized protocol, nor is it a requirement. Only the "honorable" robots such as the Google or Yahoo search spiders adhere to the file's instructions; other less-honorable bots, such as a spam spider searching for email addresses, largely ignore the file.

Also, do not use the file for access control. Robots.txt is just a suggestion for search indexing, and it will by no means block requests to a disallowed directory or file; these disallowed URLs are still freely available to anyone on the web. Additionally, the contents of this file can be used against you, as the items you place in it may indicate areas of the site that are intended to be secret or private; this information could be used to prioritize candidates for a malicious attack, with disallowed pages being the first places to target.

Finally, this file must be located in the root of the domain: www.mydomain.com/robots.txt. If your site is in a sub-folder from the domain, such as www.mydomain.com/~username/, the file must still be on the root of the domain, and you may need to speak with your webmaster to get your modifications added to the file.


Friday, May 15, 2009 9:31:37 AM (Eastern Daylight Time, UTC-04:00)

Filed under: Events

The event was about giving back to the community. A few weekends ago, April 24-26, 2009, the Impression 5 Science Center held the first ever Lansing Give Camp. The Lansing, Michigan event was a weekend of coding for charities, where nearly 50 area developers and over 10 volunteers gathered to donate their time and complete projects for 13 charities.

The event, which primarily took place in one large room on the first floor of Impression 5, was full of excitement and emotion. Sponsors stepped up at the last minute to offer additional assistance and make the event a success: TechSmith, DevExpress, the MSU University Club, and even Impression 5 each sponsored a meal during the final week. The remaining meals were covered through a collaboration between Microsoft, Wing Zone, Dominos Pizza, Guido's Pizza, Panera Bread, and Dunkin Donuts. Jennifer Middlin of TechSmith and Camron Gnass of Vision Creative also covered our late-night snacks, which included tacos and "Insomnia Cookies." Nom, nom, nom.

The biggest drama of the weekend had to be Mother Nature's visit on Saturday afternoon. A band of severe thunderstorms rolled through Lansing, knocking out power to the entire facility. We didn't lose any work, since everyone's laptop battery kicked in as soon as the lights went dark, but the outage did kill all of the wireless access points, and with them all connectivity to the source control server and to web hosting facilities. However, within minutes, Erik Larson (Director of Impression 5) was on the phone with Eric Hart (Director of the Lansing Center), and the Lansing Center responded heroically by providing us with a temporary home with power and internet access until power was restored at Impression 5. Between three teams shipping off to local coffee houses and the rest taking the trip across the street to the Lansing Center, everyone was able to continue working on their projects with minimal delay. I extend a huge "Thank you" to the Lansing Center for helping us out of a jam that could have been a major detriment to the success of our weekend.

However, it was the closing ceremony at Lansing Give Camp that stole the show. There were many emotion-filled faces throughout the staff and crowd as each project team presented and demoed its work, and each charity saw dreams achieved and went home with a year of free hosting from LiquidWeb and an "everything you need to maintain your site" bag of software and books from Microsoft. Each of the attendees even went home with one or two prizes, which included books, hardware, and software from Microsoft, books from TechSmith, and software from DevExpress, Telligent, and Telerik.

It was a great event. The charities were happy. The developers were happy. It was all a huge success. And I can't wait until next year.


Thursday, May 7, 2009 2:55:31 PM (Eastern Daylight Time, UTC-04:00)