Oct 29 2009

Reading querystring variables using SSI and regular expressions

Category: apache | Davide Zanotti @ 5:09 am

Recently I worked on a project in which I had to rely on SSI (Apache’s Server Side Includes) to read URL parameters and use them to dynamically include certain HTML files with the “include virtual” directive. Unfortunately the documentation available online is not exhaustive, and I had to figure out some things by myself.
Anyway, according to the docs, there are several global variables we can use in SSI. Two of them, DOCUMENT_URI and QUERY_STRING, let us handle the page URL. The first returns the (%-decoded) URL path of the document, the second the part of the URL that follows the “?”.
So, how can we extract the variables we want from these strings, since SSI doesn’t offer methods such as “substring”, “split”, “indexOf” and similar? The answer is: by using regular expressions in a tricky and ingenious way!
SSI offers a basic way to implement decision flow (if, elif, else). The if command has an expr attribute, which holds the expression to be evaluated, and in this attribute it is also possible to use a regex to test a given pattern. Knowing this, it is possible to declare an SSI variable that holds the desired querystring parameter in the following way:

<!--#if expr="$QUERY_STRING = /year=([0-9]{4})/" -->
    <!--ssi-comment: year found -->
    <!--#set var="year" value="$1" -->
<!--#else -->
    <!--ssi-comment: year NOT found -->
    <!--#set var="year" value="$DATE_LOCAL" -->
<!--#endif -->

In the code above I’m looking for a querystring parameter called year whose value must be a 4 ({4}) digit number ([0-9]).
If the pattern matches, the value captured by the regex group ($1) is assigned to the SSI variable year; otherwise the current year from the server date ($DATE_LOCAL) is assigned.

Notes:
1. “ssi-comment:” is not a special syntax, just a comment style I decided to adopt for readability.
2. To get only the year from the $DATE_LOCAL variable, you must configure the time format first using <!--#config timefmt="%Y" -->
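
The regex used in the expr attribute can also be tried outside Apache. The following is a minimal sketch in JavaScript (not SSI) that mirrors the if/else logic above. The helper readYear, the sample query strings and the stricter pattern are hypothetical, and are only meant to show which inputs the unanchored pattern accepts.

```javascript
// Same pattern as in the SSI expr attribute above.
const yearPattern = /year=([0-9]{4})/;

// Hypothetical stricter variant (an assumption, not from the original post):
// the parameter must start the string or follow "&", and the value must be
// exactly four digits.
const strictYearPattern = /(?:^|&)year=([0-9]{4})(?:&|$)/;

// Hypothetical helper mirroring the SSI if/else branch: returns the captured
// group when the pattern matches, otherwise the fallback value.
function readYear(queryString, fallbackYear, pattern) {
  const match = (pattern || yearPattern).exec(queryString);
  return match ? match[1] : fallbackYear;
}

const found = readYear("year=2008&month=10", "2009");    // "2008"
const missing = readYear("month=10", "2009");            // "2009" (fallback)
const fiveDigits = readYear("year=20081", "2009");       // "2008": only the first four digits are captured
const otherParam = readYear("lastyear=2007", "2009");    // "2007": the pattern is not anchored
const strictOther = readYear("lastyear=2007", "2009", strictYearPattern); // "2009" (fallback)
```

If the looser matching is a problem, anchoring the parameter name as in strictYearPattern is one option. Whether that pattern behaves the same inside SSI depends on the regex engine of the Apache version in use, so treat it as an assumption to verify.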



Sep 10 2009

Find outermost top level XML/HTML tags with regular expressions

Category: javascript | Davide Zanotti @ 7:58 am

I’m working on a big personal project (which I’m going to release soon) and in this project I need to parse strings containing XHTML tags in order to extract the top-level occurrences of a given tag name, i.e. from:

<onetag id="t1">
    <onetag id="t1-1"></onetag>
    <onetag id="t1-2"></onetag>
</onetag>
<onetag id="t2"></onetag>
<onetag id="t3"></onetag>
<onetag id="t4">
    <onetag id="t4-1"></onetag>
</onetag>

I have to get 4 tags (t1, t2, t3 and t4, with t1 and t4 containing their child nodes).
My regex knowledge is unfortunately very basic, so I googled for a ready-to-use regex, but none satisfied my needs: none of the examples I found handled nested tags properly. So, after some hours of testing, I came up with my own regex (my first real one). The result is the following:

var pattern = /<(onetag)[^<>]*>(<\1[^<>]*><\/\1>)*<\/\1>/gi;

In my case I’m using that pattern in JavaScript, but I think it can be used with any language, because it doesn’t make use of advanced features like “atomic grouping” and that kind of “black magic”. To match the desired tag, replace “onetag” with the tag you are looking for (even a tag with a namespace like “<foo:mytag>”).

EDIT:

The pattern above works only if applied to a single-line string (i.e. var myString = “<onetag id='t1'>…”). If you use it on a “complex string” (a string containing spaces and new lines between tags) it won’t work properly. Fortunately you can remove the “bad characters” beforehand with a simple replace:

var parsedString = originalString.replace(/\s(?!\w)/gi, "").match(pattern);

\s(?!\w) matches any whitespace character (space, tab or new line) that is not followed by an alphanumeric character or underscore; this way the spaces between tag attributes are not removed.
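
As a quick check, the two steps can be run together on the sample markup from the beginning of this post. This is only a sketch: the variable names stripped and topLevel are mine, and the sample is assumed to contain no text between the tags.

```javascript
// Pattern from the post: a top-level <onetag> that directly contains zero
// or more empty <onetag> children.
const pattern = /<(onetag)[^<>]*>(<\1[^<>]*><\/\1>)*<\/\1>/gi;

// Sample markup from the post, with indentation and new lines.
const originalString = [
  '<onetag id="t1">',
  '    <onetag id="t1-1"></onetag>',
  '    <onetag id="t1-2"></onetag>',
  '</onetag>',
  '<onetag id="t2"></onetag>',
  '<onetag id="t3"></onetag>',
  '<onetag id="t4">',
  '    <onetag id="t4-1"></onetag>',
  '</onetag>'
].join("\n");

// Step 1: drop whitespace that is not followed by a word character, so the
// indentation and line breaks go away but the space before "id" stays.
const stripped = originalString.replace(/\s(?!\w)/g, "");

// Step 2: collect all matches. With the "g" flag, match() returns an array
// of the matched substrings (or null when nothing matches).
const topLevel = stripped.match(pattern) || [];

console.log(topLevel.length); // 4
console.log(topLevel[0]);
// <onetag id="t1"><onetag id="t1-1"></onetag><onetag id="t1-2"></onetag></onetag>
```

Note that the replace step also removes whitespace inside text content whenever it is followed by a non-word character, so it is best suited to markup-only strings like this one.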

EDIT 2:

The pattern /<(onetag)[^<>]*>(<\1[^<>]*><\/\1>)*<\/\1>/gi won’t work properly when several types of tags are nested, i.e.:

<onetag id="t1">
    <anothertag>
         <onetag id="t1-1"></onetag>
         <onetag id="t1-2"></onetag>
    </anothertag>
</onetag>
<onetag id="t2"></onetag>
<onetag id="t3"></onetag>
<onetag id="t4">
    <anothertag>
        <onetag id="t4-1"></onetag>
    </anothertag>
</onetag>

The updated pattern is the following:

var newP = /<(onetag)[^<>]*>.*?(<\1[^<>]*>.*?<\/\1>)*.*?<\/\1>/gi;

I hope this will work without further modifications :P
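
One caveat: because .*? matches as little as possible, a pattern built on lazy quantifiers can end a match at the first inner </onetag> it reaches, and JavaScript regexes have no way to count nesting depth. As a hedged alternative (a different technique from the single regex above, not a replacement the original post proposes), the sketch below scans the opening and closing tags one by one and tracks the depth. The function name findTopLevelTags is hypothetical, and the sketch assumes the tag name contains no regex metacharacters and that there are no self-closing tags.

```javascript
// Hypothetical helper: returns the top-level <tagName>...</tagName> blocks
// of `source`, including whatever is nested inside them.
function findTopLevelTags(source, tagName) {
  // Matches each opening or closing tag with the given name.
  // Group 1 is "/" for a closing tag and "" for an opening tag.
  // The lookahead keeps "<onetagx>" from being mistaken for "<onetag>".
  const tagPattern = new RegExp("<(/?)" + tagName + "(?=[\\s>])[^<>]*>", "gi");
  const results = [];
  let depth = 0;   // how many <tagName> elements are currently open
  let start = -1;  // index where the current top-level element begins
  let m;
  while ((m = tagPattern.exec(source)) !== null) {
    if (m[1] === "") {
      if (depth === 0) start = m.index;
      depth++;
    } else if (depth > 0) {
      depth--;
      // Back at depth 0: the top-level element ends with this closing tag.
      if (depth === 0) results.push(source.slice(start, tagPattern.lastIndex));
    }
  }
  return results;
}

// Sample markup from EDIT 2, with mixed nested tags.
const mixedSource = [
  '<onetag id="t1">',
  '    <anothertag>',
  '         <onetag id="t1-1"></onetag>',
  '         <onetag id="t1-2"></onetag>',
  '    </anothertag>',
  '</onetag>',
  '<onetag id="t2"></onetag>',
  '<onetag id="t3"></onetag>',
  '<onetag id="t4">',
  '    <anothertag>',
  '        <onetag id="t4-1"></onetag>',
  '    </anothertag>',
  '</onetag>'
].join("\n");

const topLevelTags = findTopLevelTags(mixedSource, "onetag");
console.log(topLevelTags.length); // 4
```

Since the scanner works on tag positions, there is no need to strip whitespace first, and each returned block keeps its original formatting.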



Aug 25 2009

Eclipse: convert upper case text to lower case and vice versa with a simple shortcut

Category: ide | Davide Zanotti @ 8:02 am

This is just a quick post to share 2 little shortcuts to convert text from lower case to upper case and vice versa in Eclipse.

Lower case: CTRL+SHIFT+Y (CMD+SHIFT+Y on Mac OS X)
Upper case: CTRL+SHIFT+X (CMD+SHIFT+X on Mac OS X)

With both combinations you can convert a selection of one or more characters.



Aug 07 2009

I released my first interesting Flex component :)

Category: flex | Davide Zanotti @ 5:52 am

Today I released my component “ImageNavigator”; you can see it here and download the source code. I’m pretty satisfied with it, but it needs some small improvements and I’ve got some ideas to extend its features… so don’t consider it the final version :P


Aug 02 2009

Resolve Flex’s error: “Type was not found or was not a compile-time constant”

Category: flex | Davide Zanotti @ 3:23 am

I just faced the terrible nightmare of the “Type was not found or was not a compile-time constant” error (Flex Builder) and I lost a lot of time figuring out what the problem was.
I realized that my MXML Application file had the same name as a class inside one of my packages. Renaming the file solved the problem, but I was not completely satisfied and I looked for a way to avoid the error while keeping the same file/class name. Initially I tried to use namespaces, but as the reference says: “Applying a namespace means placing a definition into a namespace. Definitions that can be placed into namespaces include functions, variables, and constants (you cannot place a class into a custom namespace)”. I finally solved it by replacing the class references inside my package with the fully qualified name (from MyClass to com.mysite.foo.MyClass).


