tempalias.com – rewrites
This is yet another installment in my series of posts about building a web service in node.js. The previous post is here.
Between the last post and the current trunk of tempalias lie two substantial rewrites of core components of the service. For one, I completely misused Object.create(), which takes an object to be the prototype of the object you are creating. I wrongly assumed that it works like Crockford's object.create(), which I took to create a clone of the object you pass in.
Also, I learned that only Function objects actually have a prototype property.
Not knowing these two things made it impossible to actually deserialize the JSON representation of an alias that was previously stored in redis. This led to the first rewrite, this time of lib/tempalias.js. Now aliases work more like standard JS objects and need to be instantiated using the new operator. On the plus side, they work as expected now.
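A minimal sketch of both points, assuming a hypothetical Alias constructor with made-up fields (target, maxUsage); the real lib/tempalias.js may well look different. It shows that Object.create() hands back an empty object that merely inherits from its argument, and that a parsed JSON object has to go through the constructor again to get its methods back:

```javascript
// Hypothetical Alias type, for illustration only.
function Alias(data) {
  data = data || {};
  this.target = data.target;
  this.maxUsage = data.maxUsage;
}

Alias.prototype.isUsable = function (usedCount) {
  return usedCount < this.maxUsage;
};

// Object.create(proto) returns a NEW, empty object whose prototype is proto.
// It does not copy proto's properties onto the new object.
const template = { target: 'me@example.com' };
const derived = Object.create(template);
const ownKeys = Object.keys(derived);   // no own properties at all
const inherited = derived.target;       // found via the prototype chain
const derivedJson = JSON.stringify(derived); // inherited properties are not serialized

// JSON.parse yields a plain object without Alias.prototype, so methods are missing.
const json = JSON.stringify(new Alias({ target: 'me@example.com', maxUsage: 3 }));
const plain = JSON.parse(json);
const methodType = typeof plain.isUsable;

// Re-instantiating with `new` restores the prototype chain.
const revived = new Alias(plain);
```

Because JSON.stringify() only looks at own enumerable properties, an object built with Object.create() serializes to an empty object, which is one way such a misunderstanding can surface during persistence.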
Speaking of serialization: I learned that in V8 (and Safari)
isNaN(Date.parse((new Date()).toJSON())) === true
which, according to the ES5 spec, is a bug. The spec states that Date.parse() should be able to parse a string created by Date.prototype.toISOString(), which is what toJSON() uses.
This ended up with me doing an ugly hack (string replacement) and reporting a bug in Chrome (where the bug happens too).
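For reference, this is the round trip the spec asks for, written as a small check. It is only a sketch: the date is an arbitrary example, and on an engine that still has the bug described above, roundTrips would come out false.

```javascript
// An arbitrary fixed point in time, built in UTC so the result is reproducible.
const original = new Date(Date.UTC(2010, 3, 11, 12, 30, 0));

// toJSON() delegates to toISOString(), producing an ISO 8601 string.
const serialized = original.toJSON();

// Date.parse() returns milliseconds since the epoch, or NaN if it cannot parse.
const parsed = Date.parse(serialized);

// On a spec-compliant engine, parsing gives back the exact same instant.
const roundTrips = !isNaN(parsed) && parsed === original.getTime();
```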
Anyhow. Friday and Saturday I took off the project, but today I was on it again. This time, I was looking into serving static content. This is how we are going to serve the web site after all.
Express does provide a Static plugin, but it's fairly limited in that it doesn't do any client side caching which, even though Node.js is crazy fast, seems imperative to me. Also while allowing you to configure the file system path it should serve static content from, it insists on the static content's URL being /public/whatever, where I would much rather have kept the URL-Space together.
I tried to add If-Modified-Since support to express' static plugin, but I hit some strange interaction in how express handles the HTTP request that caused some connections to never close - not what I want.
After two hours of investigating, I was looking at a different solution, which leads us to rewrite two:
tempalias trunk now doesn't depend on express any more. Instead, it serves the web service part of the URL space manually and for all the static requests, it uses node-paperboy. paperboy doesn't try to convert node into Rails and it provides nothing but a simple static file handler for your web server which also works completely inside node's standard method for handling web requests.
I much prefer this solution because express was doing too much in some cases and too little in others: Express tries to somewhat imitate rails or any other web framework in that it not only provides request routing but also template rendering (in HAML and friends). It also abstracts away node's HTTP server module and it does so badly, as evidenced by this strange connection not-quite-ending problem.
On the other hand, it doesn't provide any help if you want to write something that doesn't return text/html.
Personally, if I'm doing a RESTful service anyways, I see no point in doing any server-side HTML generation. I'd much rather write a service that exposes an API at some URL endpoints and then also a static page that uses JavaScript / AJAX to consume said API. This is where express provides next to no help at all.
So if the question is whether to have a huge dependency which fails at some key points and doesn't provide any help with other key points or to have a smaller dependency that handles the stuff I'm not interested in, but otherwise doesn't interfere, I'd much prefer the second solution to the first one.
This is why I went with this second rewrite.
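To make the client-side caching requirement from above a bit more concrete, here is a minimal sketch of the decision behind If-Modified-Since handling. It is not paperboy's or express' actual code; the function name isNotModified and its parameters are made up for illustration.

```javascript
// Decide whether a 304 Not Modified is enough for a static file request.
// ifModifiedSince: raw request header value (or undefined)
// mtime: Date of the file's last modification (e.g. from fs.stat)
function isNotModified(ifModifiedSince, mtime) {
  if (!ifModifiedSince) return false;        // no conditional request
  const since = Date.parse(ifModifiedSince);
  if (isNaN(since)) return false;            // unparsable header: send the full file
  // HTTP dates have one-second resolution, so drop the milliseconds of mtime.
  const modified = Math.floor(mtime.getTime() / 1000) * 1000;
  return modified <= since;
}
```

A request handler would answer with status 304 and an empty body when this returns true, and otherwise send the file along with a Last-Modified header built from mtime.toUTCString().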
Because I was already using a clean MVC separation (the "view" being the JSON I emit in the API - there's no view in the traditional sense yet), the rewrite was quite hassle-free and basically nothing but syntax work.
After completing that, I felt like removing the known issues from my blog post where I was writing about persistence: Alias generation is now race-free and alias length is stored in redis too. The architecture can still be improved in that I'm currently doing two requests to Redis per ALIAS I'm creating (SETNX and SET). By moving stuff around a little bit, I can get away with just the SETNX.
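The SETNX-only idea can be sketched as follows. This is an illustration under assumptions, not the tempalias code: FakeStore is an in-memory stand-in for Redis that only mimics SETNX ("set if not exists"), and createAlias, the key prefix and the retry limit are hypothetical.

```javascript
// In-memory stand-in for Redis. Only the SETNX semantics matter here:
// the write succeeds (returns 1) only if the key does not exist yet.
class FakeStore {
  constructor() {
    this.data = new Map();
  }
  setnx(key, value) {
    if (this.data.has(key)) return 0;
    this.data.set(key, value);
    return 1;
  }
}

// Hypothetical helper: try candidate IDs until SETNX claims a free one,
// storing the serialized alias in that very same call.
function createAlias(store, alias, nextId, maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const id = nextId();
    if (store.setnx('alias:' + id, JSON.stringify(alias)) === 1) {
      return id;
    }
  }
  throw new Error('could not find a free alias id');
}
```

Because the real SETNX is atomic on the Redis server, two concurrent requests that generate the same ID cannot both succeed; the loser simply tries the next candidate.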
On the other hand, let me show you this picture here:
Considering that the current solution is already creating 1546 aliases per second at a concurrency of 100 requests, I can probably get away without changing the alias creation code any more.
And in case you ask: The static content is served with 3000 requests per second - again with a concurrency of 100.
Node is fast.
Really.
Tomorrow: Philip learns CSS - I'm already dreading this final step to enlightenment: Creating the HTML/CSS front-end UI according to the awesome design provided by Richard.
No. It’s not «just» strings
On Hacker News, I came across this rant about strings in Ruby 1.9 where a developer was complaining about the new string handling in Ruby. Now, I'm no Ruby developer by even a long shot, but I am really interested in strings and string encoding which is why I posted the following comment which I reprint here as it's too big to just be a comment:
Rants about strings and character sets that contain words of the following spirit are usually neither correct nor worthy of any further thought:
It's a +String+ for crying out loud! What other language requires you to understand this level of complexity just to work with strings?!
Clearly the author lives in his ivory tower of English language environments where he is able to use the word "just" right next to "strings" and he probably also can say that he "switched to UTF-8" without actually really having done so because the parts of UTF-8 he uses work exactly the same as the ASCII he used before.
But the rest of the world works differently.
Data can appear in all kinds of encodings and can be required to be in different other kinds of encodings. Some of those can be converted into each other, others can't.
Some Japanese encodings (Ruby's creator is Japanese) can't be converted to a Unicode representation, for example.
Nowadays, as a programming language, you have three options of handling strings:
1) pretend they are bytes.
This is what older languages have done and what Ruby 1.8 does. This of course means that your application has to keep track of encodings. Basically for every string you keep in your application, you need to also keep track what it is encoded in. When concatenating a string of encoding a to another string you already have that is in encoding b, you must do the conversion manually.
Additionally, because strings are bytes and the programming language doesn't care about encoding, you basically can't use any of the built-in string handling routines because they assume that each byte represents one character.
Of course, if you are one of these lucky English UTF-8 users, getting data in ASCII and English text in UTF-8, you can easily "switch" your application to UTF-8 by still pretending strings to be bytes because, well, they are. For all intents and purposes, your UTF-8 is just ASCII called UTF-8.
This is what the author of the linked post wanted.
2) use an internal unicode representation
This is what Python 3 does and what I feel to be a very elegant solution if it works for you: A String is just a collection of Unicode code points. Strings don't worry about encoding. String operations don't worry about it. Only I/O worries about encoding. So whenever you get data from the outside, you need to know what encoding it is in and then you decode it to convert it to a string. Conversely, whenever you want to actually output one of these strings, you need to know in what encoding you need the data and then encode that sequence of Unicode code points to any of these encodings.
You will never be able to convert a bunch of bytes into a string or vice versa without going through some explicit encoding/decoding.
This of course has some overhead associated with it, as you always have to do the encoding and because operations on that internal collection of unicode code points might be slower than the simple array-of-byte-based approach, especially if you are using some kind of variable-length encoding (which you probably are to save memory).
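A minimal sketch of the "decode on input, encode on output" discipline of option 2, here using Node's Buffer. Treat it as an approximation: JavaScript strings are sequences of UTF-16 code units rather than pure code points, and the sample bytes are made up.

```javascript
// Bytes as they might arrive from a Latin-1 encoded source: "café".
const incoming = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decode at the input boundary: bytes + known encoding -> string.
const text = incoming.toString('latin1');

// String operations work on characters, not on bytes.
const upper = text.toUpperCase();

// Encode at the output boundary: string + wanted encoding -> bytes.
const outgoing = Buffer.from(upper, 'utf8');
```

The string in the middle has four characters regardless of encoding, while the byte count changes from four (Latin-1 in) to five (UTF-8 out) because the accented letter needs two bytes in UTF-8.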
Interestingly, whenever you receive data in an encoding that cannot be represented with Unicode code points and whenever you need to send out data in that encoding, then, you are screwed.
This is a deficiency in the Unicode standard. Unicode was specifically made so that it can be used to represent every encoding, but it turns out that it can't correctly represent some Japanese encodings.
3) The third option is to store an encoding with each string and expose both the strings contents and the encoding to your users
This is what Ruby 1.9 does. It combines methods 1 and 2: It allows you to choose whatever internal encoding you need, it allows you to convert from one encoding to the other and it removes the need to externally keep track of every string's encoding because it does that for you. It also makes sure that you don't intermix encodings, but I'm getting ahead of myself.
You can still use the language's string library functions because they are aware of the encoding and usually do the right thing (minus, of course, bugs).
As this method is independent of the (broken?) Unicode standard, you would never get into the situation where just reading data in some encoding makes you unable to write the same data back in the same encoding as in this case, you would just create a string using this problematic encoding and do your stuff on that.
Nothing prevents the author of the linked post from using Ruby 1.9's facility to do exactly what Python 3 does (of course, again, ignoring the Unicode issue) by internally keeping all strings in, say, UTF-16 (you can't keep strings in "Unicode" - Unicode is no encoding - but that's for another post). You would transcode all incoming and outgoing data to and from that encoding. You would do all string operations on that application-internal representation.
A language throwing an exception when you concatenate a Latin 1-String to a UTF-8 string is a good thing! You see: Once that concatenation happened by accident, it's really hard to detect and fix.
At least it's fixable though because not every Latin1-String is also a UTF-8 string. But if it so happens that you concatenate, say Latin1 and Latin8 by accident, then you are really screwed and there's no way to find out where Latin1 ends and Latin8 begins as every valid Latin 1 string is also a valid Latin 8 string. Both are arrays of bytes with values between 0 and 255 (minus some holes).
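The difference between a detectable and an undetectable mix-up can be sketched with raw bytes in Node. The sample words and the helper name isValidUtf8 are made up for illustration; the strict check relies on TextDecoder's fatal option.

```javascript
// "naïve " as Latin-1 bytes (ï is the single byte 0xEF) and
// "café" as UTF-8 bytes (é is the two bytes 0xC3 0xA9).
const latin1Part = Buffer.from('na\u00efve ', 'latin1');
const utf8Part = Buffer.from('caf\u00e9', 'utf8');

// The accidental concatenation of two differently encoded byte strings.
const mixed = Buffer.concat([latin1Part, utf8Part]);

// A strict UTF-8 decoder rejects the mix, so the accident is at least detectable.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// Decoding the whole buffer as Latin-1 never fails: every byte maps to some
// character, so the UTF-8 part silently turns into mojibake.
const asLatin1 = mixed.toString('latin1');
```

With two single-byte encodings there is no equivalent of isValidUtf8 to fall back on, which is exactly why an exception at concatenation time is the better place to catch the mistake.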
In today's small world, you want that exception to be thrown.
In conclusion, what I find really amazing about this complicated problem of character encoding is the fact that nobody feels it's complicated because it usually just works - especially method 1 described above that has constantly been used in years past and also is very convenient to work with.
Also, it still works.
Until your application leaves your country and gets used in countries where people don't speak ASCII (or Latin1). Then all these interesting problems arise.
Until then, you are annoyed by every one of the methods I described except method 1.
Then, you will understand what great service Python 3 has done for you and you'll switch to Python 3 which has very clear rules and seems to work for you.
And then you'll have to deal with the Japanese encoding problem and you'll have to use binary bytes all over the place and have to stop using strings altogether because just reading input data destroys it.
And then you might finally see the light and begin to care for the seemingly complicated method 3.
Google Buzz, Android and Google Apps Accounts
I was looking at the Google Android Maps Application that is now providing integrated Google Buzz support, showing buzzes directly on the map and allowing you to buzz (around where I live and work, there has been a tremendous uptake of Google Buzz which makes this really compelling).
However, there's a little peculiarity about the Android maps application: If your main Google Account you configured (that's the first one you configure) on the phone is a Google Apps account, Maps will use that for buzz-support (apparently, there's already some kind of infrastructure for inter-company Buzzing in place). This means that you would only see buzzes from other people in your domain and, because there's no official support for this out there, only if they are also using an Android phone.
"Mittelpraktisch" as I would say in German.
The obvious workaround is to configure your private gmail account to be your primary account (this is only possible by factory-resetting your device by the way), but this has some disadvantages, mainly the fact that the calendar on the Android phones only supports syncing with the primary account and as it happens, usually it's the work-calendar (the Apps one) you want synchronized; not the private one (that lingers unused in my case).
To work around this issue, share your work calendar with your private Google account.
Unfortunately, I couldn't do that as I'm posting this, because the default in the domain configuration is to not allow this. Thankfully, I'm that domain's administrator, so I could change it (small company. remember.), but it seems to take a while to propagate into the calendar account.
I'll post more as my investigation turns up more, though it is my gut feeling that this mess will solve itself as Google fixes their Maps application to not use that phantom corporate buzz account.
How we use git
The following article was a comment I made on Hacker News, but as it's quite big and as I want to keep my stuff at a central place, I'm hereby reposting it and adding a bit of formatting and shameless self-promotion (i.e. links):
My company is working on a - by now - quite large web application. Initially (2004), I began with CVS and then moved to SVN and in the second half of last year, to git (after a one-year period of personal use of git-svn).
We deploy the application for our customers - sometimes to our own servers (both self-hosted and in the cloud) and sometimes to their machines.
Until the middle of last year, as a consequence of SVN's really crappy handling of branches (it can branch, but it fails at merging), we did very incremental development, adding features on customer request and bugfixes as needed, often uploading specific fixes to different sites and committing them to trunk, but rarely ever updating existing applications to trunk, in order to keep them stable.
Huge mess.
With the switch to git, we also initiated a real release management, doing one feature release every six months and keeping the released versions on strict maintenance (for all intents and purposes - the web application is highly customizable and we do make exceptions in the customized parts as to react to immediate feature-wishes of clients).
What we are doing git-wise is the reverse of what the article shows: Bug-fixes are (usually) done on the release-branches, while all feature development (except for these customizations) is done on the main branch (we just use the git default name "master").
We branch off of master when another release date nears and then tag a specific revision of that branch as the "official" release.
There is a central gitosis repository which contains what is the "official" repository, but every one of us (4 people working on this - so we're small compared to other projects I guess) has their own gitorious clone which we heavily use for code-sharing and code review ("hey - look at this feature I've done here: Pull branch foobar from my gitorious repo to see...").
With this strict policy of (for all intents and purposes) "fixes only" and especially "no schema changes", we can even auto-update customer installations to the head of their respective release-branches which keeps their installations bug-free. This is a huge advantage over the mess we had before.
Now. As master develops and bug-fixes usually happen on the branch(es), how do we integrate them back into the mainline?
This is where the concept of the "Friday merge" comes in.
On Friday, my coworker or I usually merge all changes in the release-branches upwards until they reach master. Because it's only a week worth of code, conflicts rarely happen and if they do, we remember what the issue was.
If we do a commit on a branch that doesn't make sense on master because master has sufficiently changed or a better fix for the problem is in master, then we mark these with [DONTMERGE] in the commit message and revert them as part of the merge commit.
On the other hand, in case we come across a bug during development on master and we see how it would affect production systems badly (like a security flaw - not that they happen often) and if we have already devised a simple fix that is safe to apply to the branch(es), we fix it on master and then cherry-pick the fix onto the branches.
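The [DONTMERGE] convention described above lends itself to a little tooling. The following is purely a hypothetical sketch (nothing in this post says such a script exists): it assumes the commit list comes from something like git log --format="%H %s" on the release branch, and the function names are made up.

```javascript
// Split one line of `git log --format="%H %s"` output into hash and subject.
function parseLogLine(line) {
  const space = line.indexOf(' ');
  return { hash: line.slice(0, space), subject: line.slice(space + 1) };
}

// Hypothetical helper: pick the commits whose subject carries the marker,
// i.e. the ones that should be reverted as part of the merge commit.
function commitsToRevert(commits, marker = '[DONTMERGE]') {
  return commits
    .filter((commit) => commit.subject.includes(marker))
    .map((commit) => commit.hash);
}
```

The resulting hashes are the candidates to revert before concluding the Friday merge; everything else flows upwards into master unchanged.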
This concept of course heavily depends upon clean patches, which is another feature git excels at: Using features like interactive rebase and interactive add, we can actually create commits that
- Either do whitespace or functional changes. Never both.
- Only touch the lines absolutely necessary for any specific feature or bug
- Do one thing and only one.
- Contain a very detailed commit message explaining exactly what the change encompasses.
This on the other hand, allows me to create extremely clean (and exhaustive) change logs and NEWS file entries.
Now some of these policies about commits were a bit painful to actually make everyone adhere to, but over time, I was able to convince everybody of the huge advantage clean commits provide even though it may take some time to get them into shape (also, you gain that time back once you have to do some blame-ing or other history digging).
Using branches with only bug-fixes and auto-deploying them, we can increase the quality of customer installations and using the concept of a "Friday merge", we make sure all bug-fixes end up in the development tree without each developer having to spend an awfully long time merging manually and without ending up in merge-hell where branches and master have diverged too much.
The addition of gitorious for easy exchange of half-baked features to make it easier to talk about code before it gets "official" helped to increase the code quality further.
git was a tremendous help with this and I would never in my life want to go back to the dark days.
I hope this additional insight might be helpful for somebody still thinking that SVN is probably enough.
Twisted Tornado
Lately, the net is all busy talking about the new web server released by FriendFeed last week and how their server basically does the same thing as the Twisted framework that was around so much longer. One blog entry ends with
Why Facebook/Friendfeed decided to create a new web server is completely beyond us.
Well. Let me add my two cents. Not from a Python perspective (I'm quite the Python newbie, only having completed one bigger project so far), but from a software development perspective. I feel qualified to add the cents because I've been there and done that.
When you start any project, you will be on the lookout for a framework or solution to base your work on. Often times, you already have some kind of idea of how you want to proceed and what the different requirements of your solution will be.
Of course, you'll be comparing existing requirements against the solutions around, but chances are that none of the existing solutions will match your requirements exactly, so you will be faced with changing them to match.
This involves not only the changes themselves but also other considerations:
- is it even possible to change an existing solution to match your needs?
- if the existing solution is an open source project, is there a chance of your changes being accepted upstream (this is not a given, by the way).
- if not, are you willing to back- and forward-port your changes as new upstream versions get released? Or are you willing to stick with the version for eternity, manually back-porting security-issues?
and most importantly
- what takes more time: Writing a tailor-made solution from scratch or learning how the best-matching solution ticks in order to make it do what you want?
There is a very strong perception around, that too many features mean bloat and that a simpler solution always trumps the complex one.
Have a look at articles like «Clojure 1, PHP 0» which compares a home-grown, tailor-made solution in one language to a complete framework in another and it seems to favor the tailor-made solution because it was more performant and felt much easier to maintain.
The truth is, you can't have it both ways:
Either you are willing to live with «bloat» and customize an existing solution, adding some features and not using others, or you are unwilling to accept any bloat and you will do a tailor-made solution that may be lacking in features, may reimplement other features of existing solutions, but will contain exactly the features you want. Thus it will not be «bloated».
FriendFeed decided to go the tailor-made route (take Django's reimplementations of many existing Python technologies like templating and ORM as another example of that route). But unlike the many projects that go the tailor-made route every day and keep the result to themselves, they actually went public.
Not with the intention to bad-mouth Twisted (though it kinda sounded that way due to bad choice of words), but with the intention of telling us: «Hey - here's the tailor-made implementation which we used to solve our problem - maybe it is or parts of it are useful to you, so go ahead and have a look».
Instead of complaining that reimplementation and a bit of NIH was going on, the community could embrace the offering and try to pick the interesting parts they see fitting for their implementation(s).
This kind of reinventing the wheel is a standard process that is going on all the time, both in the Free Software world and in the commercial software world. There's no reason to be concerned or alarmed. Instead we should be thankful for the groups that actually manage to put their code out for us to see - in so many cases, we never get a chance to see it and thus lose a chance at making our solutions better.
Snow Leopard and PHP
Earlier versions of Mac OS X always had pretty outdated versions of PHP in their default installation, so what you usually did was to go to entropy.ch and fetch the packages provided there.
Now, after updating to Snow Leopard you'll notice that the entropy configuration has been removed and once you add it back in, you'll see Apache segfaulting and some missing symbol errors.
Entropy has not updated the packages to snow leopard yet, so you could have a look at PHP that came with stock snow leopard: This time it's even bleeding edge: Snow Leopard comes with PHP 5.3.0.
Unfortunately though, some vital extensions are missing, most notably for me, the PostgreSQL extension.
This time around though, Snow Leopard comes with a functioning PHP development toolset, so there's nothing stopping you from building it yourself. Here's how to get the official PostgreSQL extension working on Snow Leopard's stock PHP:
- Make sure that you have installed the current Xcode Tools. You'll need a working compiler for this.
- Make sure that you have installed PostgreSQL and know where it is on your machine. In my case, I've used the One-click installer from EnterpriseDB (which survived the update to 10.6).
- Now that Snow Leopard uses a full 64-bit userspace, we'll have to make sure that the PostgreSQL client library is available as a 64-bit binary or, even better, as a universal binary. Unfortunately, that's not the case with the one-click installer, so we'll have to fix that first:
- Download the sources of the PostgreSQL version you have installed from postgresql.org
- Open a terminal and use the following commands:
% tar xjf postgresql-[version].tar.bz2
% cd postgresql-[version]
% CFLAGS="-arch i386 -arch x86_64" ./configure --prefix=/usr/local/mypostgres
% make
make will fail sooner or later because the postgres build scripts can't handle building a universal binary server, but the compile will progress far enough for us to now build libpq. Let's do this:
% make -C src/interfaces
% sudo make -C src/interfaces install
% make -C src/include
% sudo make -C src/include install
% make -C src/bin
% sudo make -C src/bin install
- Download the php 5.3.0 source code from their website. I used the bzipped version.
- Open your Terminal and cd to the location of the download. Then use the following commands:
% tar -xjf php-5.3.0.tar.bz2
% cd php-5.3.0/ext/pgsql
% phpize
% ./configure --with-pgsql=/usr/local/mypostgres
% make -j8 # in case of one of these nice 8 core macs :p
% sudo make install
% cd /etc
% sudo cp php.ini.default php.ini
- Now edit your new php.ini and add the line
extension=pgsql.so
And that's it. Restart Apache (using apachectl or the System Preferences) and you'll have PostgreSQL support.
All in all this is a tedious process and it's the price we early adopters have to pay constantly.
If you want an honest recommendation on how to run PHP with PostgreSQL support on Snow Leopard, I'd say: Don't. Wait for the various 3rd party packages to get updated.
OpenStreetMap
The last episode of FLOSS Weekly consisted of an interview with Steve Coast from OpenStreetMap. I knew about the project, but I was of the impression that it was in its infancy both content-wise and from a technical perspective.
During the interview I learned that it's surprisingly complete (unless, of course, you need a map of Canada it seems) and highly advanced from a technical point of view.
But what's really interesting is the fact how terribly easy it is to contribute. For smaller edits, you just click the edit-Link and use the Flash editor to paint a road or give it a name. If you need or want to do more, then there's a really easy to use Java based editor:
First you drag a rectangle onto a pre-rendered version of the map which will cause the server to send you the vector information consisting of that part and then you can edit whatever you want.
If you have them, you can import traces of a GPS logger to help you add roads and paths and when you are finished, you press a button and the changes get uploaded and will be visible to the public a few minutes later (though one modification I made took about an hour to arrive on the web).
When the same nodes were updated in the meantime, a really nice conflict resolution assistant will help you to resolve the conflicts.
For me personally, this has the potential to become my new after-work time sink as it combines quite many passions of mine:
- The GPS tracking, importing and painting of maps is pure technology fun.
- Actually being outside to generate the traces is healthy and also a lot of fun
- Maps also are a passion of mine. I love to look at maps and I love to compare them to my mental image of the places they are showing.
And besides all that, Open Street Map is complete enough to be of real use. For biking or hiking it even trumps Google Maps by far.
Still, at least near where I live, there are many small issues that can easily be fixed.
As the different editors are really easy to use, fixing these issues is a lot of fun and I'm totally seeing myself cleaning out all small mistakes I come across or even adding stuff that's missing. After all, this also provides me with a very good reason to visit the places where I grew up to complete some parts.
The whole concept behind being able to update a map by just a couple of mouse clicks is very compelling too as it finally gives us the potential to have really accurate maps in a very timely fashion. For example: Last October, one of the roads near my house closed and just recently the tracks of the Forchbahn were moved a bit.
Just today I added these changes to OpenStreetMap and now OSM is the only publicly available map that correctly shows the traffic situation. And all that with 15 minutes of easy but interesting work.
For those interested, my Open Street Map user profile is, of course, pilif.
JavaScript and Applet interaction
As I said earlier this month: While Java applets are dead for games and animations and whatever else they were used for back in the nineties, they still have their use when you have to access the local machine from your web application in some way.
There are other possibilities of course, but they all are limited:
- Flash loads quickly and is available in most browsers, but you can only access the hardware Adobe has created an API for. That's upload of files the user has to manually select, webcams and microphones.
- ActiveX only works in IE, not in other browsers.
- .NET ditto.
- Silverlight is neither commonly installed on your users' machines, nor does it provide native hardware access.
So what if you need to, say, access a bar code scanner? Or access a specific file on the user's computer, maybe stored in a place that is inconvenient for the user to get to (%Localappdata%, for example, is hidden in Explorer)? In this case, a signed Java applet is the only way to go.
You might tell me that a website has no business accessing that kind of data and generally, I would agree, but what if your requirements are to read data from a bar code scanner without altering the target machine at all and without requiring the user to perform any steps but to plug in the scanner and click a button?
But Java applets have that certain 1996 look to them, so even if you access the data somehow, the applet still feels foreign to your cool Web 2.0 application: It doesn't quite fit the tight coupling between browser and server that AJAX gets us and even if you use Swing, the GUI will never look as good (and customized) as something you could do in HTML and CSS.
But did you know that Java Applets are fully scriptable?
Per default, any JavaScript function on a page can call any public method of any applet on the site. So let's say your applet implements
    public String sayHello(String name) {
        return "Hello " + name;
    }
Then you can use JavaScript to call that method (using jQuery here):
    $('#some-div').html(
        $('#id_of_the_applet').get(0).sayHello($('#some-form-field').val())
    );
If you do that, you have to remember though that any applet method called this way will run inside the sandbox regardless of whether the applet is signed or not.
So how do you access the hardware then?
Simple: Tell the JRE that you are sure (you are. aren't you?) that it's ok for a script to call a certain method. To do that, you use AccessController.doPrivileged(). Say, for example, you want to check if some specific file is on the user's machine. Let's further assume that you have a singleton RuntimeSettings that provides a method to check the existence of the file and then return its name. Then you could do something like this:
    import java.security.AccessController;
    import java.security.PrivilegedAction;

    public String getInterfaceDirectory() {
        // Everything inside run() executes with the applet's full privileges.
        return AccessController.doPrivileged(
            new PrivilegedAction<String>() {
                public String run() {
                    return RuntimeSettings.getInstance().getInterfaceDirectory();
                }
            }
        );
    }
Now it's safe to call this method from JavaScript despite the fact that RuntimeSettings.getInterfaceDirectory() directly accesses the underlying system. Whatever is in PrivilegedAction.run() will have full hardware access (provided the applet in question is signed and the user has given permission).
Just keep one thing in mind: Your applet is fully scriptable and if you are not very careful where that Script comes from, your applet may be abused and thus the security of the client browser might be at risk.
Keeping this in mind, try to:
- Make these elevated methods do one and only one thing.
- Keep the interface between the page and the applet as simple as possible.
- In elevated methods, do not call into JavaScript (see below) and certainly do not eval() any code coming from the outside.
- Make sure that your pages are sufficiently secured against XSS: Don't allow any user generated content to reach the page unescaped.
The explicit and cumbersome declaration of elevated actions was put in place to make sure that the developer keeps the associated security risk in mind. So be a good developer and do so.
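To make the first point of that list a bit more concrete, here is a minimal sketch of what a narrowly scoped elevated method could look like. The class name, the method name and the allowlist entries are all made up for illustration; the only real API used is AccessController.doPrivileged() (which newer JDKs mark as deprecated, hence the annotation). The idea is that the script on the page can only ask a yes/no question about a fixed set of names, so nothing it passes in is ever used as a free-form path:

```java
import java.io.File;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of any library: one elevated method
// that does exactly one thing.
final class ElevatedDirectoryCheck {

    // Assumption for this sketch: the only directory names the page may ask about.
    private static final Set<String> ALLOWED =
        new HashSet<String>(Arrays.asList("applet-interfaces", "drivers"));

    // Returns true if baseDir contains an allowlisted directory called name.
    @SuppressWarnings({"removal", "deprecation"})
    static boolean hasKnownDirectory(final String name, final File baseDir) {
        // Reject everything that is not on the allowlist before elevating,
        // so values like "../etc" never reach the file system.
        if (name == null || !ALLOWED.contains(name)) {
            return false;
        }
        return AccessController.doPrivileged(new PrivilegedAction<Boolean>() {
            public Boolean run() {
                // The single privileged operation: a directory existence check.
                return Boolean.valueOf(new File(baseDir, name).isDirectory());
            }
        });
    }
}
```

A public applet method would then just forward to hasKnownDirectory() with a base directory the applet itself picks, and hand the boolean back to the page.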
Using this technology, you can even pass around Java objects from the Applet to the page.
Also, if you need your applet to call into the page, you can do that too, of course, but you'll need a bit of additional work.
- You need to import JSObject from netscape.javascript (yes, that's what it's called. It works in all browsers though), so to compile the applet, you'll have to add plugin.jar (or netscape.jar, depending on the version of the JRE) from somewhere below your JRE/JDK installation to the build classpath. On a Mac, you'll find it below /System/Library/Frameworks/JavaVM.framework/Versions/<your version>/Home/lib.
- You need to tell the Java plugin that you want the applet to be able to call into the page. Use the mayscript attribute of the java applet for that (interestingly, it's just mayscript - without value, thus making your nice XHTML page invalid the moment you add it - mayscript="true" or the correct mayscript="mayscript" don't work consistently on all browsers).
- In your applet, call the static JSObject.getWindow() and pass it a reference to your applet to acquire a reference to the current page's window object.
- On that reference you can call eval() or getMember() or just call() to call into the JavaScript on the page.
This tool set allows you to add the applet to the page at a size of 1 by 1 pixel, positioned way outside of the viewport and with visibility: hidden, while writing the actual GUI code in HTML and CSS and using normal JS/AJAX calls to communicate with the server.
If you need access to specific system components, this (together with JNA and applet-launcher) is the way to go, IMHO, as it solves the anachronism that is Java GUIs in applets.
There is still the long launch time of the JRE, but that's getting better and better with every JRE release.
I was having so much fun last week discovering all that stuff.
This month’s find: jna and applet-launcher
Way way back, I was talking about Java applets and native libraries and the things you need to consider when writing applets that need access to native libraries (mostly for hardware access). And let's be honest - considering how far HTML and JavaScript have come, native hardware access is probably the only thing you still need applets for.
Java is slow and bloated and users generally don't seem to like it very much, but the moment you need access to specific hardware - or even just to specific files on the user's filesystem, Java becomes an interesting option as it is the only technology readily available on multiple platforms and browsers.
Flash only works for hardware Adobe has put an API in (cameras and microphones) and doesn't allow access to arbitrary files. .NET doesn't work on browsers (it works on IE, but the solution at hand should work on browsers too) and ActiveX is generally horrible, doesn't work in browsers and additionally only works under Windows (.NET works in theory on Unixes and Macs as well).
Which leaves us with Java.
Because applets are scriptable, you get away with hiding the awful user interface that is Swing (or, god forbid, AWT) and writing a nice integrated GUI using web technologies.
But there's still the issue with native libraries.
First, your applet needs to be signed - no way around that. Then, you need to manually transfer all the native libraries and extension libraries. Also, you'll need to put them in certain predefined places - some of which require administration privileges to be written into.
And don't get me started on JNI. Contrary to .NET, you can't just call into native libraries. You'll have to write your own glue layer between the native OS and the JRE. That glue layer is platform specific of course, so you'd better have your C compiler ready - and the platforms you intend to run on, of course.
So even if Java is the only way, it still sucks.
Complex deployment, administrative privileges and antiquated glue layers. Is this what you would want to work with?
Fortunately, I've just discovered two real pearls that completely solve these two problems. That leaves me with just the hassle that is Java itself, but it's always nice to keep some practice in multiple programming languages, as long as it doesn't involve C *shudder*.
The first component I'm going to talk about is JNA (Java Native Access) which is for Java what P/Invoke is for .NET: A way for directly calling into the native API from your Java code. No JNI and thus no custom glue code and C compiler needed. Translating the native calls and structures into what JNA wants still isn't as convenient as P/Invoke, but it sure as hell beats JNI.
In my case, I needed to find the directory corresponding to CSIDL_LOCAL_APPDATA when running under Windows. While I could have hacked something together, the only really reliable way of getting the correct path is to query the Windows API, for which JNA proved to be the perfect fit.
JNA of course comes with its own glue layer (available in precompiled form for more platforms than I would ever want to support in the first place), so this leads us directly to the second issue: Native libraries and applets don't go very well together.
This is where applet-launcher comes into play. Actually, applet-launcher's functionality is even built into the JRE itself - provided you target JRE 1.6 Update 10 and later, which isn't realistic in most cases (just today I was handling a case where an applet had to work with JRE 1.3 which was superseded in 2002), so for now, applet-launcher which works with JRE 1.4.2 and later is probably the way to go.
The idea is that you embed the applet-launcher applet instead of the applet you want to embed in the first place. The launcher will download a JNLP file from the server, download and extract external JNI glue libraries and finally load your applet.
When compared with the native 1.6 method, this has the problem that the library which uses the JNI glue has to have some special hooks in place, but it works like a charm and fixes all the issues I've previously had with native libraries in applets.
These two components renewed my interest in Java as a glue layer between the web browser, where your application logic resides, and the hardware the user depends on. While earlier methods kind of worked but were either hacky or a real pain to implement, this is as clean as it gets and works like a charm.
And next time we'll learn about scripting Java applets.
digg bar controversy
Update: I actually wrote this post yesterday and scheduled it for posting today. In the meantime, digg has found an even better solution and only shows their bar for logged-in users. Still - a solution like the one provided here would allow for the link to go to the right location regardless of the state of the digg bar settings.
Recently, digg.com added a controversial feature, the digg bar, which basically frames every posted link in a little IFRAME.
Rightfully so, webmasters were concerned about this and quite quickly, we had the usual religious war going on between the people finding the bar quite useful and the webmasters hating it for lost page rank, even worse recognition of their site and presumed affiliation with digg.
Ideas crept up over the weekend, but turned out not to be so terribly good.
Basically it all boils down to digg.com screwing up on this, IMHO.
I know that they let you turn off that dreaded digg bar, but all the links on their page still point to their own short url. Only then is the decision made whether to show the bar or not.
This means that all links on digg currently just point to digg itself, not awarding any linked page with anything but the traffic, which they don't necessarily want. Digg traffic isn't worth much in terms of returning users. You get dugg, you melt your servers, you go back to being unknown.
So you would probably appreciate the higher page rank you get from being linked to by digg, as that leads to increased search engine traffic which generally is worth much more.
The solution on digg's part could be simple: Keep the original site URL in the href of their links, but use some JS magic to still open the digg bar. That way they still get to keep their foot in the user's path away from the site, but search engines will now do the right thing and follow the links to their actual target, thus giving the webmasters their page rank back.
How to do this?
Here's a few lines of jQuery to automatically make links formatted in the form
<div id="link_container">
    <a id="xxbc34fb" href="http://example.com/articles/cool_article">Cool Article</a>
    <a id="xxbc38fc" href="http://example.com/articles/cool_article2">Cool Article 2</a>
    <a id="xxbc39fk" href="http://example.com/articles/cool_article3">Cool Article 3</a>
</div>
be opened via the digg bar while still working correctly for search engines (assuming that the link's ID is the digg shorturl):
$(function(){
    $('div#link_container a').click(function(){
        // point the link at the digg short URL just before the browser follows it
        this.href = 'http://digg.com/' + this.id;
    });
});
Piece of cake.
No further changes needed and all the web masters will be so much happier while digg gets to keep all the advantages (and it may actually help digg to increase their pagerank as I could imagine that a site with a lot of links pointing to different places could rank higher than one without any external links).
Webmasters then still could do their usual parent.location.href trickery to get out of the digg bar if they want to, but they could also retain their page rank.
No need to add further complexity to the web's standards because one site decides not to play well.
