Fri, 2008 Sep 12
Posted in Wistle
at 09:00
by jmorgan
Multi Site views and public files
Finally, I want to be able to create the views, and do so using haml, erb, etc,
and store them in Subversion, and have different views (and that means different
stylesheets, etc) for each site. That involves three major actions:
- Decide how to organize the per-site files.
- Figure out how to get those files updated.
- Tell Merb where to find the files.
My organization goes like this:
/app
  /sites
    /SITENAME
      /views
      (possibly /helpers here in the future).
/public
  /sites
    /SITENAME
The file updates are trickier. An easy option would be to use svn:externals.
This could be a hassle, though, if the Wistle app is hosting a lot of sites.
Instead, I'm going to update SiteSync to also update the views and public files.
This will be done by deleting the current directory (when there has been an
update) and exporting the most recent files. First thing, a few more properties
in Site:
class Site
property :views_uri, Text
property :views_revision, Integer, :default => 0
property :public_uri, Text
property :public_revision, Integer, :default => 0
# A URI based off of contents_uri to use as the base for building URI's
# for public and views
def base_uri
ary = contents_uri.split("/")
ary.pop if ary[-1].blank?
ary.pop
ary.join("/") + "/"
end
def views_uri
@views_uri || (base_uri + "app/views")
end
def public_uri
@public_uri || (base_uri + "public")
end
end
The additional methods give me some default URIs based on the contents_uri. This
is based on my preferred organization.
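To make the defaults concrete, here is a minimal sketch of the same URI arithmetic outside of DataMapper. SiteStub and the example URI are hypothetical stand-ins, and String#empty? replaces the blank? extension so the snippet runs in plain Ruby.

```ruby
# Hypothetical stand-in for the Site model; no DataMapper required.
SiteStub = Struct.new(:contents_uri) do
  # Drop the last path segment of contents_uri and keep a trailing slash.
  def base_uri
    parts = contents_uri.split("/")
    parts.pop if parts[-1].to_s.empty?
    parts.pop
    parts.join("/") + "/"
  end

  # Default locations, assuming views and public files sit next to the
  # articles folder in the same repository.
  def views_uri
    base_uri + "app/views"
  end

  def public_uri
    base_uri + "public"
  end
end

site = SiteStub.new("svn://example.com/blog/articles")
puts site.views_uri
puts site.public_uri
```

So, if the articles live in .../blog/articles, the defaults point at .../blog/app/views and .../blog/public.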
SiteSync is where the big updates happen. Basically, I add some methods to check
if there are updates to the views or public files. If so, the current files are
deleted and an export is done. This means that the files could be inaccessible
for a few seconds (depending on connection speed and repository size). I'm also
not sure if/when reboots would be required in a production environment.
class SiteSync
def run
super
export_views
export_public
end
def export_views
export("views", File.join(Merb::root, "app", "sites", @model_row.name, "views"))
end
def export_public
export("public", File.join(Merb::root, "public", "sites", @model_row.name))
end
def export(name, export_path)
export_path = File.expand_path(export_path)
uri = @model_row.__send__("#{name}_uri")
rev = @model_row.__send__("#{name}_revision")
connect(uri)
return false if @repos.latest_revnum <= rev
updated_rev = @repos.stat(uri[(@repos.repos_root.length)..-1], @repos.latest_revnum).created_rev
return false if updated_rev <= rev
FileUtils.mkdir_p(export_path)
FileUtils.rm_rf(export_path)
@ctx.export(uri, export_path)
@model_row.update_attributes("#{name}_revision" => @repos.latest_revnum)
true
end
end
Method #export is the workhorse here, and the bulk is checking if we really need to do
any work and that the path is ready for the export. The actual @ctx.export line
is anticlimactic.
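The "check, clear, export" sequence can be tried without a repository. This is only a sketch: refresh_export is a hypothetical helper, the revision numbers are made up, and a block stands in for the real @ctx.export call.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: replace export_path with fresh content, but only when
# the source revision is newer than the one already exported.
def refresh_export(export_path, exported_rev, latest_rev)
  return false if latest_rev <= exported_rev
  FileUtils.mkdir_p(export_path) # make sure the parent directories exist
  FileUtils.rm_rf(export_path)   # remove only the leaf, stale files included
  yield export_path              # the exporter recreates the leaf directory
  true
end

results = Dir.mktmpdir do |root|
  target = File.join(root, "app", "sites", "example", "views")
  FileUtils.mkdir_p(target)
  File.write(File.join(target, "stale.erb"), "old")

  # Stand-in for the Subversion export: create the directory and one file.
  exporter = lambda do |path|
    Dir.mkdir(path)
    File.write(File.join(path, "index.erb"), "new")
  end

  first  = refresh_export(target, 3, 5, &exporter) # newer revision: export
  files  = Dir.children(target)
  second = refresh_export(target, 5, 5, &exporter) # up to date: skip
  [first, files, second]
end
p results
```

The mkdir_p followed by rm_rf looks odd, but it leaves the parent directories in place while making sure the leaf itself does not exist when the export runs.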
This does require some updates in Wistle::SvnSync because we may be accessing
multiple repositories within one instance. In short, #connect and #context both
need to accept a uri option rather than relying on @config.uri. Probably some
refactoring is in order (move all connection work to another class, for
example).
Telling Merb where to find the files
This turns out to be surprisingly easy, so long as the "correct" helper methods
are used. On that note, I'll look first at the public files. This requires two
override methods in GlobalHelpers.
module Merb
module GlobalHelpers
def image_tag(img, opts ={})
opts[:path] ||= "/sites/#{@site.name}/images/"
super(img, opts)
end
def asset_path(asset_type, filename, local_path = false)
path = super(asset_type, filename, local_path)
"/sites/#{@site.name}#{path}"
end
end
end
image_tag generates a :path option to the site-specific image directory, unless
:path has been set manually. It then calls super to let the original method do
the real work.
asset_path is similar but, well, backwards. This is called by js_include_tag
and css_include_tag to generate the appropriate path. I call super to let the
parent method again do the real work. Then I prepend its result with the
site-specific public path.
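The "call super, then prepend" idea works with any helper that lives in a module. Here is a minimal sketch in plain Ruby; BaseHelpers, SiteHelpers and View are hypothetical, and the base asset_path is a simplification, not Merb's actual implementation.

```ruby
# Hypothetical stand-in for the framework's helper module.
module BaseHelpers
  def asset_path(asset_type, filename)
    "/#{asset_type}s/#{filename}"
  end
end

# Site-aware override: let the original build the path, then prefix it.
module SiteHelpers
  include BaseHelpers

  def asset_path(asset_type, filename)
    # A bare super passes the same arguments on to BaseHelpers#asset_path.
    "/sites/#{@site_name}#{super}"
  end
end

class View
  include SiteHelpers

  def initialize(site_name)
    @site_name = site_name
  end
end

path = View.new("example").asset_path(:stylesheet, "main.css")
puts path
```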
Alternatively, if I were using Lighttpd, Apache, or something similar to serve
public files, I could use the web server's URL rewriting capabilities.
The approach I take for the views is fun. In Application, I add this little
jewel:
class Application < Merb::Controller
before :update_template_roots
after :revert_template_roots
def update_template_roots
self.class._template_roots = [
["#{Merb.root}/app/views", :_template_location],
["#{Merb.root}/app/sites/#{@site.name}/views", :_template_location]
]
end
def revert_template_roots
self.class._template_roots = [
["#{Merb.root}/app/views", :_template_location]
]
end
end
Is that an ugly hack or what? Surely there's a better way than back and forth
modifying a class variable. Please? Well, there probably is, but I don't know
Merb's internals well enough.
The key is the class method (I believe representing a class variable),
_template_roots . If I understand it all correctly, this is used by render to
determine possible base paths and what method to use with that path. So, before
each request, a filter tacks on the current site's view path as a possible
root, the action renders, and an after filter reverts to the default. Why this
back and forth?
Because one request could be directly followed by a request for a different
Site.
I half expect to be beaten in my sleep for that one. But it works.
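For what it's worth, the set-and-revert dance can be wrapped so that the revert always happens, even when rendering raises. This is a sketch in plain Ruby, not Merb code: FakeController, template_roots and with_site_roots are hypothetical names, and it does nothing about two requests running at the same time.

```ruby
# Hypothetical controller stand-in; only the class-level list mimics the idea.
class FakeController
  DEFAULT_ROOTS = ["app/views"].freeze

  class << self
    attr_accessor :template_roots
  end
  self.template_roots = DEFAULT_ROOTS

  # Add the site's view directory for the duration of the block, then restore
  # the previous list even if the block raises.
  def self.with_site_roots(site_name)
    previous = template_roots
    self.template_roots = previous + ["app/sites/#{site_name}/views"]
    yield template_roots
  ensure
    self.template_roots = previous
  end
end

seen = FakeController.with_site_roots("example") { |roots| roots.dup }
p seen
p FakeController.template_roots
```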
Revision 79
Conclusion
Of course, this is just the starting point, but it's met my goals, and I hope
it's illustrated both some basics of Merb and DataMapper as well as how these
can be used to interact with data that is not stored in a relational database.
After all, great frameworks and libraries can really free us to focus on the
important bits, but they can also make it difficult to see all the
possibilities.
After that cheesy statement, here are a few pieces I'd like to expand Wistle with
in the future:
- Tags (now supported)
- Search
- Date links (i.e. /2008 gets all articles from 2008)
- RSS/Atom
- A sync action (for use by, e.g. subversion hooks; now supported)
- Pagination
- Per-site Helpers (maybe; I've debated whether there's any likely value in
this)
- Better support for STI (I've played with this a bit)
Finally, it's worth mentioning that my intent with these articles is illustrative
and/or tutorial, rather than to start a "project". That is, I hope this helps
people who are writing their own blog or similar application. However, should
you decide to use Wistle, that's great, and I'd be happy to receive bug reports,
feature requests, etc. Whether I will do anything with them probably depends on
the day.
Fri, 2008 Sep 05
Posted in Wistle
at 09:00
by jmorgan
Multi Site models
Ah, point #4, multiple sites hosted on one Wistle instance. I'm not going to
create additional "library" functionality to support this. After all, this is
getting pretty application-specific. But, I am going to take advantage of the
existing Wistle library.
The key point here is going to be a Site model, that: a) Articles belong to; and
b) takes over storing per-site configuration. In essence, it will replace
Wistle::Config. The key is that "site-wide" configuration will substitute for
model-wide configuration. So, let's start with the Site model.
class Site
include DataMapper::Resource
has n, :articles
property :id, Integer, :serial => true
property :name, String, :unique => true, :nullable => false
property :domain_regex, String
# Subversion
property :contents_uri, Text
property :contents_revision, Integer, :default => 0
property :username, String
property :password, String
property :property_prefix, String, :default => "ws:"
property :extension, String, :default => "txt"
# Content Filters
property :article_filter, String
property :comment_filter, String
# Timestamps
property :created_at, DateTime
property :updated_at, DateTime
end
Here's the properties (and has n, :articles).
Note that the properties under the "Subversion" heading
are pretty close to the instance variables of Wistle::Config. Also, notice
contents_uri and contents_revision. These match with Config's uri and revision.
Why the prefix? Because I want to be able to use a different uri (possibly in
another repo) for views and public files. But that is for the next section. I
could set up username, etc this same way; I won't for now, because I have no
use for doing so. If I were, however, I would probably create yet another model,
called "Config" or something that belongs to a Site, with a role property.
Like I said, it's not needed for now, so I won't bother.
The contents_* fields could create a problem though, because Wistle::SvnSync
expects different names. A simple solution is some (not-quite) aliasing:
class Site
def uri
@contents_uri
end
def revision
@contents_revision
end
def revision=(rev)
attribute_set(:contents_revision, rev)
end
def body_property
:body
end
end
The revision= is also used by Wistle::SvnSync, and body_property is another
configuration option that SvnSync expects. With body_property, there's only one
option, at least so long as I only use the one model (Article). So,
body_property always returns :body. I'll show how all this hooks into SvnSync in
a moment. Before that, though, a bit about the :domain_regex property.
Wistle is not designed to be user-friendly in the traditional sense, except
when the user is defined as me. For example, adding Sites, deleting Comments,
etc. must, at this point, be done through a console. That's great by me, but for
someone without programming experience, Wistle would probably not be a good
choice. Another example is the domain regex property. It's used by
Site.by_domain (below) to find a site based on a domain. Except, as its name
implies, domain_regex is a regular expression. Great for me, might be less
attractive to others.
class Site
class << self
# Find a Site by domain regex, prefer longest match.
def by_domain(val)
possible = []
# Find matching Sites
Site.all.each do |s|
r = Regexp.new(s.domain_regex.to_s, true)
m = r.match(val)
if m
possible << [s, m[0].length]
end
end
# Sort for longest match.
possible.sort!{ |a, b| b[1] <=> a[1] }
possible[0] ? possible[0][0] : nil
end
end
end
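To see the "longest match wins" rule in isolation, here is a minimal sketch without the database. SiteEntry, SITES and site_for are hypothetical, and unlike Site.by_domain this version skips sites with an empty pattern rather than letting them match everything with length zero.

```ruby
# Hypothetical stand-in for Site rows.
SiteEntry = Struct.new(:name, :domain_regex)

SITES = [
  SiteEntry.new("main", 'example\.com'),
  SiteEntry.new("blog", 'blog\.example\.com'),
  SiteEntry.new("unconfigured", nil)
]

# Return the site whose pattern matches the longest part of host, or nil.
def site_for(host, sites = SITES)
  candidates = sites.map do |site|
    next if site.domain_regex.to_s.empty? # assumption: no pattern, no match
    match = Regexp.new(site.domain_regex, Regexp::IGNORECASE).match(host)
    [site, match[0].length] if match
  end.compact
  best = candidates.max_by { |_, length| length }
  best && best[0]
end

puts site_for("blog.example.com").name
puts site_for("WWW.Example.com").name
p site_for("other.org")
```

Both patterns match blog.example.com, but the blog pattern matches more of the host, so it wins.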
I no longer need to include Wistle::Svn in the Article model, but I do need to
add in the properties that Wistle::Svn took care of.
class Article
# Subversion-specific properties
property :path, String
property :svn_created_at, DateTime
property :svn_updated_at, DateTime
property :svn_created_rev, String
property :svn_updated_rev, String
property :svn_created_by, String
property :svn_updated_by, String
end
I also update how Filters works to deal with the *_filter properties. To utilize
these properties, in Article and Comment, I change the :filter option of the
body property to set :default => :site . This tells the Filters::Resource
module to use the Site model to determine default filters. In Comment, I also
add a method #site, because Filters may try to call this method.
class Comment
def site
@article.site
end
end
Now, you may have noticed a few weird methods that didn't do much in SvnSync,
particularly #get and #new_record. Here's where they come in. To use SvnSync
with the new Site model (instead of the Wistle::Model Model), a few things have
to change. First, Site doesn't have a config method, pointing to a
Wistle::Config object. It does, however, respond to the same methods as a
Config object. Second, when creating or getting the content, we need to scope by
Site. What to do? Inherit Wistle::SvnSync and override a few key methods.
class SiteSync < Wistle::SvnSync
def initialize(model_row)
@model_row = model_row
@model = Article
@config = model_row
end
# Get an Article by site and path.
def get(path)
Article.first(:site_id => @model_row.id, :path => short_path(path))
end
def new_record
@model.new(:site_id => @model_row.id)
end
end
Awesome-sauce.
Now, just hook in Site to SiteSync and all the ugly work is done!
class Site
def sync
SiteSync.new(self).run
end
class << self
def sync_all
Site.all.each do |site|
site.sync if site.contents_uri
end
end
end
end
The controllers need a few updates to filter by Site (and the application view
needs one for the list of recent articles, but I'm ignoring views). Application
needs updates first:
class Application < Merb::Controller
before :sync_articles
before :choose_site
protected
def sync_articles
Site.sync_all
end
def choose_site
@site = Site.by_domain(request.host)
end
end
I change the sync_articles method to use Site.sync_all. Then, I add a
choose_site before filter to assign @site, using Site.by_domain (request.host
is the full host name including any port number).
One other bit I want to do that might as well fall in this section is folders
as categories. My approach here is definitely a reflection of my personal
organizational style; in addition, the code is probably not a good solution.
Anyway, I want each top-level folder under the articles directory to represent
a category; I want to be able to add additional subfolders without them creating
additional categories. I also prefer to use only one category per article, with
additional "categorization" through tags (which I will not be implementing in
this already way too long article).
To do so, I need to add a category property, which I'll update with a before
:save hook
class Article
property :category, String
before :save, :update_category
def update_category
if attribute_dirty?(:category) || @category.nil?
attribute_set(:category, @path.split('/')[0]) if @path
end
end
end
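The category rule itself is just "first path segment". A tiny sketch, with category_for as a hypothetical helper and made-up paths:

```ruby
# Derive the category from a repository-relative path: the top-level folder.
def category_for(path)
  return nil if path.nil? || path.empty?
  path.split("/")[0]
end

puts category_for("ruby/merb/routes.txt")
puts category_for("ruby/intro.txt")
puts category_for("about.txt")
```

Subfolders don't create new categories, which is what I want. One thing this makes visible: a file sitting directly in the articles directory ends up with its own file name as its category.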
I then add two new methods to Site, one to get a list of categories, the second
to find published articles by category.
class Site
  def categories
    repository.adapter.query('SELECT category FROM articles WHERE site_id = ? group by category order by category', self.id)
  end

  def published_by_category(category = nil, options = {})
    conditions = "datetime(published_at) <= datetime('now') and site_id = ?"
    binds = [self.id]
    if category
      # Bind the category instead of interpolating it; it comes from the URL.
      conditions << " and path like ?"
      binds << "#{category}/%"
    end
    Article.all(options.merge(
      :conditions => [conditions, *binds],
      :order => [:published_at.desc]))
  end
end
Now is also a nice time for some routing updates, both to take advantage of
categories, and for "permalink" paths for the Articles. I'm taking advantage
of Merb's support for regular expressions in routes:
Merb::Router.prepare do |r|
r.resources :articles do | article |
article.resources :comments
end
r.match('/').to(:controller => 'articles', :action =>'index')
r.match(%r[/categories/(.*)]).to(
:controller => 'articles', :action => 'index', :category => '[1]')
r.match(%r[/(.*)]).to(
:controller => 'articles', :action => 'show', :path => '[1]')
end
The articles resource remains to support comments, although it is probably not
needed.
The last match is the "permalink" one, so that there's not "articles" or other
prefixes in permalinks; doing this obviously depends on the particular
application.
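The order of the matches matters, because the permalink pattern matches nearly anything. Here is a sketch of that in plain Ruby, assuming routes are tried in the order they are defined and the first match wins. ROUTES and route_for are hypothetical, the patterns are anchored for the sake of the example, and the resource routes are left out.

```ruby
# Each entry pairs a pattern with a lambda that builds the route parameters.
ROUTES = [
  [%r{\A/\z},                lambda { |m| { :action => "index" } }],
  [%r{\A/categories/(.*)\z}, lambda { |m| { :action => "index", :category => m[1] } }],
  [%r{\A/(.*)\z},            lambda { |m| { :action => "show", :path => m[1] } }]
]

# Return the parameters of the first route that matches, or nil.
def route_for(request_path)
  ROUTES.each do |pattern, builder|
    match = pattern.match(request_path)
    return builder.call(match) if match
  end
  nil
end

p route_for("/")
p route_for("/categories/ruby")
p route_for("/ruby/merb-routes")
```

If the catch-all came before the categories route, /categories/ruby would be treated as an article path.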
And the Articles controller gets a couple of updates to take advantage of these
routes:
class Articles < Application
# provides :xml, :yaml, :js
def index
@articles = @site.published_by_category(params[:category])
display @articles
end
def show
if params[:path]
@article = Article.first(:path => params[:path], :site_id => @site.id)
else
@article = Article.first(:id => params[:id], :site_id => @site.id)
end
raise NotFound unless @article
display @article
end
end
Revision 73
And, next, the views...
Fri, 2008 Aug 29
Posted in Wistle
at 09:00
by jmorgan
Body Filters
My next big step is to filter the content, so that Article#html, for example,
is the body property filtered through Markdown. So, I created a Filters module,
in lib/filters.rb . I won't show the code here, but I am going to discuss my
approach. Of course, plenty of other packages, such as Mephisto, have
already addressed this issue and done so well. But, a big part of this project
is for my own personal enjoyment. And I want to write random code, eh?
The crux of my approach is that all the filtering libraries I'm accustomed to
can be used as such: FilterClass.new(content).to_html. So, the Filters
module attempts to initialize an object of the specified class and
call #to_html. If needed, the module tries to require the appropriate file or
gem.
A constant Hash is defined, with each pair in the format:
NameSpecifiedInModel => [[require_name, ClassName], [backup_require_name, BackupClassName]]
For example:
{
'Smartypants' => [['rubypants', 'RubyPants']],
'Markdown' => [['rdiscount', 'RDiscount'], ['bluecloth', 'BlueCloth']]
}
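Since I'm not showing the real module, here is a rough sketch of how such a table could be walked: try each pair in order and fall back when a library isn't installed. load_filter_class and DEMO_CANDIDATES are hypothetical, and the demo uses a made-up library name plus stdlib's Set so it runs without any Markdown gem.

```ruby
# Try each [require_name, class_name] pair in order; return the first class
# that loads, or nil when none is available.
def load_filter_class(candidates)
  candidates.each do |require_name, class_name|
    begin
      require require_name
      return Object.const_get(class_name) if Object.const_defined?(class_name)
    rescue LoadError
      next # not installed, try the backup
    end
  end
  nil
end

# Hypothetical table entry: the first library does not exist, the second is
# in the standard library.
DEMO_CANDIDATES = [["no_such_filter_library", "NoSuchFilter"], ["set", "Set"]]
loaded = load_filter_class(DEMO_CANDIDATES)
p loaded
```

With the Markdown entry above, the same loop would pick RDiscount when it is installed and BlueCloth otherwise.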
In the model, this is set up by include Filters::Resource (probably
not the most useful name). Then, properties can be set to be filtered with an
option, :filter. The syntax when defining a property is:
property :prop_name, :filter => {:to => :filtered_prop, :with => :filter_column, :default => "DefaultFilter"}
(:with and :default are optional, though at least one should be specified.)
If the properties in :to and :with have not yet been defined, they will be
defined automatically. Hence, if you want to specify any options with this, they
should be defined before the filtered property.
This is similar to Wistle::Svn in that it extends the property methods and
stores information in a class instance variable. It also adds a method
process_filters, called by a before :save hook, that updates the :to
property.
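The filtering step itself boils down to passing the text through each filter in turn. A minimal sketch, assuming every filter follows the FilterClass.new(content).to_html convention; ShoutFilter, ParagraphFilter and apply_filters are hypothetical stand-ins for the real filters and for process_filters.

```ruby
# Two toy filters that follow the new(text).to_html convention.
class ShoutFilter
  def initialize(text)
    @text = text
  end

  def to_html
    @text.upcase
  end
end

class ParagraphFilter
  def initialize(text)
    @text = text
  end

  def to_html
    "<p>#{@text}</p>"
  end
end

# Run the body through each filter in order; the output of one filter is the
# input of the next.
def apply_filters(body, filter_classes)
  filter_classes.inject(body) { |text, klass| klass.new(text).to_html }
end

html = apply_filters("hello", [ShoutFilter, ParagraphFilter])
puts html
```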
So, in Article and Comment, we will now have:
property :html, Text, :lazy => false
property :body, Text,
:filter => {:to => :html, :with => :filters, :default => %w{Markdown Smartypants}}
I also update views to use #html instead of #body.
There's room for design debate here. One of the things I like about DataMapper
is that the programmer explicitly declares properties. But, here, Filters is
doing a lot behind the scenes, including possibly declaring some properties.
Still, the design "feels right" to me.
Revision 54
Fri, 2008 Aug 22
Posted in Wistle
at 09:00
by jmorgan
One Site Subversion
So, we have a more or less working blog application using our friends Merb and
DataMapper. That's great and if you were looking for a Merb/DataMapper tutorial,
hopefully the first entry helped.
Still, the central goal is to store the articles
in a source-control repository. So, let's get going on that.
For now, I'm going to ignore the multi-site requirement, for two reasons: I want
to first focus on just interacting with Subversion, without extra distractions;
and I happen to know that I want to write the library that will be covered in
this section for other uses.
Before diving into the code, I want to examine three "big" design questions.
- How much abstraction? I could create a library that abstracts so that it
  presents a unified API for multiple SCMs. But I won't. Again, it
  complicates things. Also, the "Subversion" stuff will not be accessed from
  many points within the app, so I feel fairly safe with the possibility of
  future "API changes" if I decide to abstract it later.
- What SCM? This is an easy call for me. I'm accustomed to Subversion,
  including having some experience using its SWIG bindings. I've played a
  little with git, but at some point I have to cut off the "learning new things
  on this one app".
- How to interact with Subversion? Here are some possibilities:
  - Command line/backticks: I'm not entirely opposed to this, but since there
    are better options, no reason to look here.
  - RSCM: This may no longer be true, but from my memory of RSCM, it more or
    less uses the command line functions. It offers the benefit of being
    abstracted, but like the command line, it means working with a working
    copy. Sure, I could have one checked out in a tmp directory, but I don't
    care for that idea.
  - post-commit hooks: This could be pretty useful, and I could see extending
    Wistle to accept, say, XML or YAML sent by such a hook (it probably
    would be fairly simple). One downside is that it requires permissions to
    modify the hook. I don't actually anticipate this would be a common
    problem. The second downside is then I don't get to play :-( Oh, and the
    third is getting pre-existing data.
  - CSCM: Theoretically, along the lines of RSCM; so far, it's only for
    Subversion, but it uses the SWIG bindings. Downsides are that it's not
    under active development (with occasional exceptions on my personal copy,
    I suppose), and that it's geared towards a different purpose.
  - Using Subversion SWIG bindings directly: Yay! This allows a bit more
    control and focus than using one of the libraries, and since we don't
    need a lot, I don't think this is reinventing the wheel. Or, maybe, I'm
    just using an earlier wheel, down one level of abstraction. The big
    downside to this is that installing the SWIG bindings can be a massive
    pain unless you have a distro that has a nice package; it may well be
    impossible on Windows...
  - I'll throw in one more, which is using the svn or Subversion DeltaV--or
    whatever the correct name is--protocols directly. Neither protocol is
    particularly frightening, but that path would still be a lot of extra work
    for probably minimal gain. It also has the downside that you have to be
    running either svnserve or an http server.
Questions answered, I'll add a few more design thoughts before diving into the
code.
One approach, one which I've already tried, is to skip the relational database
altogether. This is certainly possible, and with caching of the generated pages
would be fast enough for my purposes. However, custom text searches are a
problem, requiring loading all the current data, then performing the search in
Ruby.
Since this is not scalable, my solution is for the application to actually
retrieve data from a relational database which mirrors the current state of
repository. Therefore, the main functionality I need to add is to update the
database after any commit to the repository. Initially, I had the update
procedure run at every request but for most requests, this only checked the
current revision number. Other options would be a cron job, an "update" button
on the site, etc. My current solution is an action, "/articles/sync_all", and
a post-commit hook that wget's that page.
To derive this updating functionality, I want to include a Module in the
appropriate models. I'll call it Wistle::Svn because I can't think of a more
useful name. I'll save the file as "lib/wistle/svn.rb".
The first thing (yes, finally) I want this module to do is add some properties to
any model which includes it. So, let's start with:
module Wistle
module Svn
class << self
def included(klass) # Set a few 'magic' properties
klass.property :path, String
klass.property :svn_created_at, DateTime
klass.property :svn_updated_at, DateTime
klass.property :svn_created_rev, String
klass.property :svn_updated_rev, String
klass.property :svn_created_by, String
klass.property :svn_updated_by, String
end
end
end
end
Path will store the relative path in the repository. It will also serve as a
permalink later on (Note, path, and the *_by's were added in later commits
than the others. I just went and missed them).
The others are your basic created/updated timestamps except they will be kept in
sync with the Subversion repo. This allows for having an #updated_at in the
database without interfering with the auto timestamp functionality, etc. Also,
we'll keep track of the revisions. #svn_created_rev is for information only;
#svn_updated_rev will be important to the sync method. So, every model that
includes Wistle::Svn gets these properties, stored in the relational database.
Of course, I'm now assuming that this will only be included in a class that
includes DataMapper::Resource.
Next up, I want to be able to specify, in the model, which is the "body"
property; that is, what property in the relational db should store the contents
of the file in Subversion. So, I need to accept an option to the property
class method. But before I get there, this introduces a problem. How should I
store this configuration data?
If you check out ActiveRecord::Base, for example, you'll see a lot of lines
like this:
cattr_accessor :table_name_prefix, :instance_writer => false
@@table_name_prefix = ""
I'm no expert on Rails internals, but I've spent a decent amount of time going
through ActiveRecord in particular and this seems to be the preferred Rails'
method for doing class-wide configuration. cattr_accessor is a Rails addition
to Ruby (Merb has it as well). Having spent time in ActiveRecord, my first
inclination was to use this. And as a methodology, it works pretty well when
you're inheriting your functionality. Class variables in an included module
don't work (at least not in any way I understand).
Instead, I decided to just use a configuration class. It's simpler and cleaner,
in my opinion, and doesn't have the inclusion problem mentioned above (I'll get
to how that works in a bit). So, let's start defining that class:
module Wistle
class Config
attr_accessor :body_property
def initialize
# Set defaults
@body_property = 'body'
end
end
end
All it does, for now, is define an instance variable, @body_property (the
name of the property in the database that stores the contents of the file) and
use attr_accessor to create the getter and setter methods.
But our model needs access to the Config data. Again, I could try to make
it a class variable, but there's still the problem with class variables in
modules. Fortunately, in Ruby, everything is an Object. So, a class can have
instance variables.
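module Wistle::Svn
  module ClassMethods
    def config
      # Fully qualified: inside "module Wistle::Svn", a bare Config is not
      # looked up in Wistle.
      @config ||= Wistle::Config.new
    end
  end
end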
Easy enough? I also need to extend the model class with the methods in
ClassMethods when the module is included. This is a popular Rails trick. To the
Wistle::Svn.included method, add the line
klass.extend(ClassMethods). Now, if Article includes Wistle::Svn,
we can access the config via #config (in the class), and self.class.config (from
instances). And, I can always add custom methods for configuration options that
are more likely to be accessed. Now, then, I can update DataMapper's property
class method to accept an option saying that a particular property stores the
file's contents.
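Here is the whole trick in one self-contained sketch, so it can be tried without DataMapper. Syncable, DemoArticle and DemoPage are hypothetical names; the point is that each including class gets its own config object, stored in an instance variable of the class itself.

```ruby
module Syncable
  class Config
    attr_accessor :body_property

    def initialize
      @body_property = "body" # default
    end
  end

  module ClassMethods
    # @config here is an instance variable of whichever class calls config.
    def config
      @config ||= Syncable::Config.new
    end
  end

  # Called by Ruby when a class includes Syncable.
  def self.included(klass)
    klass.extend(ClassMethods)
  end
end

class DemoArticle
  include Syncable
end

class DemoPage
  include Syncable
end

DemoArticle.config.body_property = "contents"
puts DemoArticle.config.body_property
puts DemoPage.config.body_property
```

Changing the setting on one class leaves the other at its default, which is exactly what a shared class variable would not give me.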
module Wistle::Svn
module ClassMethods
def property(name, type, options = {})
if options.delete(:body_property)
config.body_property = name.to_s
end
super(name, type, options)
end
end
end
Using this would be something like:
class Article
  property :contents, Text, :body_property => true
end
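The override works because of how extend stacks modules: the module extended last is searched first, and super hands the call on to the next one. A minimal sketch with a hypothetical BaseProperties module standing in for DataMapper's own property method:

```ruby
# Hypothetical stand-in for the ORM's property class method.
module BaseProperties
  def property(name, type, options = {})
    properties[name] = [type, options]
  end

  def properties
    @properties ||= {}
  end
end

# Strip the custom option, remember the name, then defer to the original.
module BodyPropertyOption
  def property(name, type, options = {})
    @body_property = name.to_s if options.delete(:body_property)
    super(name, type, options)
  end

  def body_property
    @body_property || "body"
  end
end

class DemoDoc
  extend BaseProperties
  extend BodyPropertyOption # extended last, so its property is found first

  property :title, String
  property :contents, String, :body_property => true
end

puts DemoDoc.body_property
p DemoDoc.properties[:contents]
```

Deleting the option before calling super matters: the underlying property method never sees a key it doesn't know about.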
I'll look at what Wistle::Svn does with this information when I discuss syncing
the databases. Hopefully, I will get to that point eventually.
As an aside, since I don't anticipate any instance methods in the Wistle::Svn
module, I could drop the ClassMethods module and use extend instead of
include in my model. But I've chosen the include for consistency with
DataMapper.
The wistle_models table
Before I can get to syncing, the database will need to know the version of its
"working copy", as it were. Except, I suppose, for the first update. I reckon
I need another table in the database that keeps track of the current revision
for each Wistle::Svn model. So, 'lib/wistle/model.rb':
module Wistle
class Model # Table is named wistle_models.
include DataMapper::Resource
property :id, Integer, :serial => true
property :name, String
property :revision, Integer
end
end
And this file needs to be required in 'lib/wistle.rb'. Just for fun, let's run
rake dm:db:autoupgrade. Alas, no luck, the new model doesn't
migrate. There's a good reason why: none of the Wistle module is required when
running Merb (As an aside, it just seems more reasonable to me to include
Wistle::Model in the Wistle lib instead of directly in the models directory). Add
another dependency in init.rb, but there's a gotcha here. This dependency should
not be declared until after use_orm :datamapper, because it depends
on DataMapper being loaded.
use_orm :datamapper
dependency 'lib/wistle.rb'
Awesome. I guess. You can run that migration now and it should work. And now
let's get our Subversion-y models talking to this model.
module Wistle::Svn
module ClassMethods
def svn_repository
return @svn_repository if @svn_repository
@svn_repository = Wistle::Model.first(:name => self.name)
@svn_repository ||= Wistle::Model.create(:name => self.name, :revision => 0)
@svn_repository.config = config
@svn_repository
end
end
end
Again, I use the Class instance variable trick. I only want to set up
@svn_repository when I have to, so if it's already available, I just return it.
Next, I try to get a row in wistle_models that is set up for the current model.
If no luck there, I create such a row. Finally, I give this Model instance
direct access to the Subversion-ized Model's @config. Which means one more
update to Wistle::Model: attr_accessor :config.
Before hitting the update code, I want to flesh out the Wistle::Config class.
The other five configuration elements I want are:
- uri: The uri of the folder in the Subversion repository where the model's
  contents are stored (file:///path/to/repo/path/to/folder,
  svn://example.com/path/to/folder, etc.)
- username: The Subversion username to use, if needed.
- password: The Subversion password to use, if needed.
- property_prefix: This addresses a question I didn't ask above. How to deal
  with properties other than the contents. I could, for example, start each
  file with a bit of yaml or xml or what have you. I'm going to store the other
  properties using Subversion's property mechanism. However, I want to minimize
  the chance of name conflicts, so I provide a setting for a prefix. As a
  default, I'll use "ws:" (for Wistle::Svn, I guess).
- extension: The extension of files that will be included in the update. This
  is certainly not necessary, but it works for me.
class Wistle::Config
OPTS = [:uri, :username, :password,
:body_property, :property_prefix, :extension]
attr_accessor *OPTS
def initialize
# Set defaults
@body_property = 'body'
@property_prefix = 'ws:'
@extension = 'txt'
end
end
The OPTS constant is because I'll re-use this list momentarily.
I also want to be able to set some of these settings in database.yml, if it's
available. At the end of the initialize method, I add:
if Object.const_defined?("Merb")
  f = "#{Merb.root}/config/database.yml"
  # Fall back before calling to_sym; nil.to_sym would raise.
  env = (Merb.env || :development).to_sym
end
# Only read the file "if it's available".
if f && File.exist?(f)
  config = YAML.load(IO.read(f))[env] || {}
  OPTS.each do |field|
    config_field = config["svn_#{field}"] || config["svn_#{field}".to_sym]
    if config_field
      instance_variable_set("@#{field}", config_field)
    end
  end
end
Now, in database.yml, I can add :svn_username: my_login. That is,
I can prefix any of the fields defined above with 'svn_'. I'm not sure that
sentence made sense.
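Maybe an example makes more sense than the sentence. This sketch pulls the svn_-prefixed settings out of a YAML string; svn_settings, SVN_OPTS and the sample values are hypothetical, and it uses plain string keys rather than the symbol keys mentioned above so that it runs unchanged on current Ruby versions.

```ruby
require "yaml"

SVN_OPTS = [:uri, :username, :password,
            :body_property, :property_prefix, :extension]

# Pick the svn_* settings for one environment out of database.yml-style text.
def svn_settings(yaml_text, env)
  section = YAML.load(yaml_text)[env.to_s] || {}
  SVN_OPTS.each_with_object({}) do |field, settings|
    value = section["svn_#{field}"]
    settings[field] = value unless value.nil?
  end
end

sample = <<~YAML
  development:
    adapter: sqlite3
    svn_username: my_login
    svn_extension: markdown
YAML

p svn_settings(sample, :development)
p svn_settings(sample, :production)
```

Only keys that carry the svn_ prefix and appear in the option list are picked up; everything else in the file is ignored.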
Revision 42
Updating
Hey, it's time for the central code, sync the database from the repository. If
you're particularly interested in using Subversion's SWIG bindings, one of the
more interesting parts of this project might be the Wistle::Fixture library,
which I use to generate Subversion repository "test fixtures", but which I won't
cover here. Incidentally, if you are so inclined, see the test cases included in
Subversion's repository. The actual code isn't commented, but it's "fairly"
readable.
I'm putting the syncing code in its own class, because, well, that's what my
brain says I should do. The only initialization argument it requires is
the appropriate row in Wistle::Model. It only provides one other public
method, #run, which runs the updating, going through the following steps:
- Connect to the repository. See #connect, #context, and #callbacks private
methods. Most of what's going on here is dealing with different
authentication options. Honestly, I don't have a solid understanding of this
bit.
- Check if we have updated to the last revision already. If so, quit.
- Run the repository's #log method. This gets information about each commit,
starting with the most recent; I've specified to get revisions only through
the last update (stored in Wistle::Model#revision). Store this information
in the variable changesets.
- Reverse changesets and run #do_changeset on each element.
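To make the shape of #run concrete without any Subversion bindings, here's a toy sketch. FakeRepos is a stand-in I made up, not the SWIG API, and sync is a hypothetical name. One deliberate choice in the sketch: it asks for the log down to last_synced + 1, so the revision that was already processed isn't replayed.

```ruby
# A stand-in for the repository session; it is NOT the SWIG API.
class FakeRepos
  def initialize(revisions)
    @revisions = revisions # e.g. { 3 => ["/articles/a.txt"], ... }
  end

  def latest_revnum
    @revisions.keys.max || 0
  end

  # Yield each revision from +from+ down to +to+, newest first.
  def log(from, to)
    from.downto(to) do |rev|
      yield @revisions[rev], rev if @revisions.key?(rev)
    end
  end
end

# Returns the revision numbers that would be applied, oldest first.
def sync(repos, last_synced)
  return [] if repos.latest_revnum <= last_synced # already up to date

  changesets = []
  repos.log(repos.latest_revnum, last_synced + 1) do |paths, rev|
    changesets << [paths, rev]
  end
  # The log arrives newest first; replay it oldest first.
  changesets.sort_by { |_paths, rev| rev }.map { |_paths, rev| rev }
end

repos = FakeRepos.new(1 => ["a"], 2 => ["b"], 3 => ["a", "c"])
p sync(repos, 1)
p sync(repos, 3)
```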
SvnSync#do_changeset actually updates the database. For each change in the changeset:
- It determines whether the change was one I'm interested in, and if so, what
kind of change. There are three types of interest: moves, modifications/adds,
and deletes.
- Moves are the most problematic, mostly because Subversion doesn't really have
a "move" concept. Instead, we're looking for a node that was copied from another
node in the same changeset in which the latter node was deleted. In this case,
as opposed to "just a copy", I don't want to create a new entry in the
database, but rather modify the path of the existing entry. Why? To not
invalidate foreign keys, i.e. to keep comments listed with the article after
it's renamed.
- Next, do any deletes. It's possible we won't find the node to delete, either
because it was actually a move, or because it refers to a file we don't keep
track of. In that case, just continue on with the next delete.
- Modify/Add/Replace: In all these cases, what I want is to update the content
of the appropriate row, creating a new row if needed. The private
method #get is responsible for finding the appropriate row, based on the path. The
update covers the contents and other properties, both those specified by the
revision and the actual node properties.
- When all changes have been processed, update the Wistle::Model row with the
new current revision.
If you aren't familiar with the SWIG bindings, the code will probably be a bit
confusing, but hopefully the outline above will help clarify what's going on.
More to the point, I hope it illustrates that ORM's are not the only available
storage mechanisms for web apps.
So, the code (yikes):
module Wistle
class SvnSync
def initialize(model_row)
@model_row = model_row
@model = Object.const_get(@model_row.name)
@config = @model_row.config
end
# There is the possibility for unnecessary updates, as a database row may be
# modified several times (if modified in multiple revisions) in a single
# call. This is inefficient, but--for now--not enough to justify more
# complex code.
def run
connect unless @repos
return false if @repos.latest_revnum <= @model_row.revision
changesets = [] # TODO Maybe revision + 1
@repos.log(@path_from_root, @repos.latest_revnum, @model_row.revision, 0, true, false
) do |changes, rev, author, date, msg|
changesets << [changes, rev, author, date]
end
changesets.sort{ |a, b| a[1] <=> b[1] }.each do |c| # Sort by revision
do_changeset(*c)
end
return true
end
private
# Get the relative path from config.uri
def short_path(path)
path = path[@path_from_root.length..-1]
path = path[1..-1] if path[0] == ?/
path.sub!(/\.#{@config.extension}\Z/, '') if @config.extension
path
end
# Get an object of the @model, by path.
def get(path)
@model.first(:path => short_path(path))
end
# Create a new object of the @model
def new_record
@model.new
end
# Process a single changeset.
# This doesn't account for possible move/replace conflicts (A node is moved,
# then the old node is replaced by a new one). I assume those are rare
# enough that I won't code around them, for now.
def do_changeset(changes, rev, author, date)
modified, deleted, copied = [], [], []
changes.each_pair do |path, change|
next if short_path(path).blank?
case change.action
when "M", "A", "R" # Modified, Added or Replaced
modified << path if @repos.stat(path, rev).file?
when "D"
deleted << path
end
copied << [path, change.copyfrom_path] if change.copyfrom_path
end
# Perform moves
copied.each do |copy|
del = deleted.find { |d| d == copy[1] }
if del
# Change the path. No need to perform other updates, as this is an
# "A" or "R" and thus is in the +modified+ Array.
record = get(del)
record.update_attributes(:path => short_path(copy[0])) if record
end
end
# Perform deletes
deleted.each do |path|
record = get(path)
record.destroy if record # May have been moved or refer to a directory
end
# Perform modifies and adds
modified.each do |path|
next if @config.extension && path !~ /\.#{@config.extension}\Z/
record = get(path) || new_record
svn_file = @repos.file(path, rev)
# update body
record.__send__("#{@config.body_property}=", svn_file[0])
# update node props -- just find any props with property_prefix
svn_file[1].each do |name, val|
if name =~ /\A#{@config.property_prefix}(.*)/
record.__send__("#{$1}=", val)
end
end
# update revision props
record.path = short_path(path)
record.svn_updated_at = date
record.svn_updated_rev = rev
record.svn_updated_by = author
if record.new_record?
record.svn_created_at = date
record.svn_created_rev = rev
record.svn_created_by = author
end
record.save
end
# Update model_row.revision
@model_row.update_attributes(:revision => rev)
end
def connect
@ctx = context
# This will raise some error if connection fails for whatever reason.
# I don't currently see a reason to handle connection errors here, as I
# assume the best handling would be to raise another error.
@repos = ::Svn::Ra::Session.open(@config.uri, {}, callbacks)
@path_from_root = @config.uri[(@repos.repos_root.length)..-1]
return true
end
def context
# Client::Context, which particularly holds an auth_baton.
ctx = ::Svn::Client::Context.new
if @config.username && @config.password
# TODO: What if another provider type is needed? Is this plausible?
ctx.add_simple_prompt_provider(0) do |cred, realm, username, may_save|
cred.username = @config.username
cred.password = @config.password
end
elsif URI.parse(@config.uri).scheme == "file"
ctx.add_username_prompt_provider(0) do |cred, realm, username, may_save|
cred.username = @config.username || "ANON"
end
else
ctx.auth_baton = ::Svn::Core::AuthBaton.new()
end
ctx
end
# callbacks for Svn::Ra::Session.open. This includes the client +context+.
def callbacks
::Svn::Ra::Callbacks.new(@ctx.auth_baton)
end
end
end
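If the SWIG objects get in the way of seeing the move detection, here's a toy version of just that classification step, using a plain Struct in place of the change objects. Change and classify are names I made up for the sketch. Unlike the real code, which leaves moved paths in the delete list and relies on the lookup coming back empty, this version drops them from the delete list explicitly.

```ruby
# Stand-in for the change objects the bindings hand back.
Change = Struct.new(:action, :copyfrom_path)

# Split one changeset into moves, deletes and modifications.
# A "move" is a copy whose source path was deleted in the same changeset.
def classify(changes)
  deleted  = changes.select { |_path, c| c.action == "D" }.keys
  modified = changes.select { |_path, c| %w[M A R].include?(c.action) }.keys
  moves = changes.each_with_object({}) do |(path, c), found|
    source = c.copyfrom_path
    found[source] = path if source && deleted.include?(source)
  end
  { moves: moves, deleted: deleted - moves.keys, modified: modified }
end

changes = {
  "/articles/old-name.txt" => Change.new("D", nil),
  "/articles/new-name.txt" => Change.new("A", "/articles/old-name.txt"),
  "/articles/draft.txt"    => Change.new("M", nil),
  "/articles/gone.txt"     => Change.new("D", nil)
}
p classify(changes)
```

The renamed file shows up twice on purpose: once as a move (so the existing row keeps its id and its comments) and once as a modification (so its contents get refreshed).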
Time to hook the pieces together.
An update to Wistle::Svn, to add the .sync class method to including models:
module Wistle::Svn
module ClassMethods
def sync
Wistle::SvnSync.new(svn_repository).run
end
end
end
In Article, after including DataMapper::Resource,
include Wistle::Svn.
Run rake dm:db:automigrate to add in Wistle::Svn's properties to
Article.
And, now, to make the syncs happen. I'm going to go with one sync for every
request, for now. This may prove to be terribly inefficient (the connect code
to the Subversion repository is not cheap), but if so, I'll change it later.
So, a nice before filter in Application should do the trick.
class Application < Merb::Controller
before :sync_articles
protected
def sync_articles
Article.sync
end
end
Finally, I'm going to remove all methods and associated views from Articles that
can update an Article, i.e. new, create, edit, update and destroy.
And, well...that's it. Well, you do need to set up an appropriate Wistle::Config
in Article (or in database.yml).
Revision 48
Fri, 2008 Aug 15
Posted in Wistle
at 09:00
by jmorgan
So, I decided to create a blogging application. After all,
Typo was pretty nice and I've quite happily used (and
abused) Mephisto for some time. And of course,
there's a thousand other options out there. But, over the last few years, I've
developed a wish list, so:
- Store the actual articles in a source code repository, ideally Subversion
(or maybe git, but I'm much more comfortable with Subversion).
- Store views in the same repo as the articles (or, at least, separate from the
app itself. I don't have a particularly good reason for this, but point #6
will take care of that).
- Views in anything other than liquid. I
mean, I just can't stand it. I understand its purpose and it's great and
all, but I wanna' write my view code in Ruby..or PHP..or VBA..or Lisp..or
that programming language that's all in whitespace..or... Phhbbt!
- By default, set up for multiple sites hosted by a single app.
- Easy to add content filters, like Markdown and Textile, but also including
my own (This is actually pretty easy in Mephisto, but again, point #6).
- I wanted a relatively simple but challenging project to do on my own, so I
mostly made up 1-5 to justify it. Hey-o!
I picked Merb for the framework and
DataMapper for the ORM, mostly because I've been
experimenting with these lately. In addition, they feel more flexible than
Rails for doing stuff like points one and two, and because I can't stand the
font on the DataMapper site. Hello? WTF is that?
"Font-that-looks-like-I-designed-it-in-MS-Paint? While fending off a horde of rabid chihuahuas?"
Seriously.
Oh, it's a "humanist sans-serif typeface".
Good to know.
Anywho, I thought I'd walk you [insert your name here] through the process,
partly because it has some fun, not-normally-tutorial-stuff aspects, partly as
yet another intro to the rapidly changing worlds of Merb and DataMapper. Mostly
because I just felt like it.
I'm also going to try to do this in discrete stages, adding elements of the
above requirements as I go. I think that will work okay. Be agiley and s---.
Repository note
I have set up a project on Google code at
http://code.google.com/p/wistle/, for you
to actually view the code. I'll also reference particular revisions. However,
big note, there are some errors in the code, and some things changed because
Merb and DataMapper are changing, and because I'm learning. So, things might
not work for you, and what's in these articles may be different from what's
in the repository. Hopefully, this won't be too big a problem,
because--hopefully--these blog entries will explain enough to start you on the
right path to figuring out what's wrong.
Here goes.
Generate the app
NB: If you're trying on Windows...good luck. By the way, if you use a Windows
machine, for whatever reason, colinux can be your friend.
- Get the gems. See the respective sites (Merb,
DataMapper).
- merb-gen app wistle (No, I don't know why we're calling this app
'wistle'. It's what I happened to type.)
Hopefully, that worked. If not, do some google searches; depending on the day,
everything may work or fail terribly.
We need to pause for some configuration. The
file we want is config/init.rb. Not a lot of options, but they're well
commented. All I'm going to do is uncomment two lines:
use_orm :datamapper and use_test :rspec. This
just tells Merb that I'm using DataMapper and RSpec. I assume it uses this
knowledge to load appropriate libraries or something. I don't really know.
The other bit-o-configuration we need is database.yml. I like to stick with
sqlite for development unless I'm intending to use features specific to a given
database.
:development: &defaults
  :adapter: sqlite3
  :database: db/dev.db
:test:
  <<: *defaults
  :database: db/test.db
:production:
  <<: *defaults
  :database: db/pro.db
And, before you start scaffolding, create a db directory if there's not one
already.
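If the &defaults / <<: *defaults business looks like line noise, this small check shows what it expands to. It's only a sketch: it parses an inline copy of the YAML with Ruby's standard library, which is not necessarily how Merb or DataMapper read the file.

```ruby
require "yaml"

# Inline copy of the relevant part of database.yml.
yaml_text = <<~YAML
  :development: &defaults
    :adapter: sqlite3
    :database: db/dev.db
  :test:
    <<: *defaults
    :database: db/test.db
YAML

# Symbols and aliases have to be allowed explicitly with safe_load.
config = YAML.safe_load(yaml_text, permitted_classes: [Symbol], aliases: true)

# :test inherits :adapter from the anchor and overrides :database.
p config[:test]
```

So each environment only has to spell out what differs from development.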
Revision 5
Scaffolding
Yeah, I know, scaffolding sucks, but it's a quick way to get some working code,
because this ain't the interesting bit. First, though, what models/resources
do I need?
- Article, for the actual articles. This one will become interesting later.
- Comment, for people to leave comments.
- Site, to specify each different site. But I'm going to leave out the
multi-site requirement for now.
- User? Nope, I'm not going to bother. Since I know that my article editing
will ultimately happen on some text editor and be committed to a Subversion
repository, I don't have a need for User accounts. I could add it later,
say, if I wanted commenters to have accounts or something. Oh, as an aside,
after having completed several Rails apps, the only thing interesting about
user accounts are the passwords. For some reason, this gets overcomplicated.
So, just Article and Comment. And nothing too fancy. Note that the underscore
in date_time matters. Otherwise, you're liable to get a constant missing error.
Another gotcha is that an "id" field is not generated automatically.
merb-gen resource Article
id:integer
title:string
body:text
published_at:date_time
comments_allowed_at:date_time
created_at:date_time
updated_at:date_time
merb-gen resource Comment
id:integer
author:string
email:string
body:text
article_id:integer
parent_id:integer
created_at:date_time
updated_at:date_time
We have several more things to do before we can really get the app running. The
first is routing. I understand that Merb's router is quite powerful. But, I'm
not intending to venture there for now.
I want the actual code of router.rb to look like this for now (just using REST
routing for the two models just created). I'll update this a bit as time goes
on.
Merb::Router.prepare do |r|
r.resources :articles
r.resources :comments
r.default_routes
end
Next, specify that id is the primary key for both tables. So, in each model,
change the line property :id, Integer to
property :id, Integer, :serial => true, thus telling DataMapper
that id is an auto-numbering primary key.
Then, migrate the database. Yay, no migration files! This is probably a personal
preference, but I really like specifying the tables fields within the model.
rake dm:db:automigrate
The next was a surprise to me. Apparently, link_to is now in the
"merb-assets" gem and must be required explicitly (Thanks to this article for
the solution). Likewise,
"error_messages_for" is in "merb_helpers" (You may need to
gem install merb_helpers). So, add to init.rb
dependencies "merb_helpers", "merb-assets".
To start the app, the command is, well, merb. Add a "-p ####" to specify a
port other than 3000.
So, play around, check out the scaffolded code, yadda, yadda.
Revision 11
Clean Up the Scaffolding
The next step is to get the app working like I want it, without messing with
the storage in Subversion stuff. One thing to note is that I'm not going to
address "look and feel" in this article. (Except sort of at the tail-end).
I generally like to start with the models, although I don't really have an
"approach". Oh, and I don't plan on going over specs/tests in this article,
although I'll be writing some (probably less than some people would prefer).
Anyway, first stop is making the properties in the models work just like I want
them.
Validations - I won't validate anything for the Article model because
editing will ultimately be done in Subversion, and, well, I generally don't
care to validate data that I personally will be inputting. But the Comments
will see some changes.
- First, we need 'dm-validations'. There's several places you could require
it (directly, in the model for example), but I'll add it as a dependency
in init.rb. For some reason, I had a version problem, so I specified it
explicitly:
dependency 'dm-validations', '= 0.9.1'. (Later,
I removed the version).
- Then, add some options to some of the properties. Add
:nullable => false to #body, #author and #article_id;
also, add :length => 100 to #author (Because I feel like it);
and :format => :email_address for #email.
By default, DataMapper validates based on this info. So, a
:nullable => false results in a
validates_present. Of course, you can use explicit
validations if desired or
needed.
I'm not sure how to disable the format validation (for email) when no
address has been supplied. So, I'll customize the setter.
def email=(val)
if val.blank?
attribute_set(:email, nil)
else
attribute_set(:email, val)
end
end
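The idea, as far as I can tell, is that an empty form field arrives as "" rather than nil, and only nil reads as "no address given". Here's the same normalization in plain Ruby, with no DataMapper involved; CommentSketch is a made-up class, and it uses strip/empty? because blank? isn't part of core Ruby.

```ruby
# Plain-Ruby stand-in for the Comment model (no DataMapper here).
class CommentSketch
  attr_reader :email

  # Treat nil, "" and whitespace-only input as "no address given".
  def email=(val)
    @email = val.to_s.strip.empty? ? nil : val
  end
end

comment = CommentSketch.new
comment.email = "   "
p comment.email
comment.email = "reader@example.com"
p comment.email
```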
Lazy Loading - DM lazy loads Text fields by default. I don't anticipate
retrieving Articles or Comments without using their #body fields, so, add
:lazy => false option to the #body properties.
Relationships - Comments belong to a) an Article and b) possibly a parent
Comment. Associations look a
bit different if you're accustomed to ActiveRecord, but nothing too weird.
Here's the updates. Some of these associations have some extra options,
such as ordering and scope. Note particularly Article#direct_comments.
class Article
has n, :comments
has n, :direct_comments,
:class_name => 'Comment',
:order => [:created_at.asc],
:parent_id => nil
end
class Comment
belongs_to :article
belongs_to :parent,
:class_name => 'Comment',
:child_key => [:parent_id]
has n, :replies,
:class_name => 'Comment',
:child_key => [:parent_id],
:order => [:created_at.asc]
end
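What those associations are meant to return can be shown with plain Ruby objects. This is just an illustration of the intended results (top-level comments versus replies, both oldest first), not of how DataMapper builds its queries; CommentNode and the two helper methods are hypothetical names.

```ruby
CommentNode = Struct.new(:id, :parent_id, :body, :created_at)

# Top-level comments: those with no parent, oldest first.
def direct_comments(comments)
  comments.select { |c| c.parent_id.nil? }.sort_by(&:created_at)
end

# Replies to one comment, oldest first.
def replies_to(comments, parent)
  comments.select { |c| c.parent_id == parent.id }.sort_by(&:created_at)
end

# created_at is a plain integer here to keep the example short.
comments = [
  CommentNode.new(1, nil, "First!", 10),
  CommentNode.new(2, 1, "Reply to first", 30),
  CommentNode.new(3, nil, "Second top-level", 20),
  CommentNode.new(4, 1, "Earlier reply", 25)
]
p direct_comments(comments).map(&:id)
p replies_to(comments, comments.first).map(&:id)
```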
I want to be able to call @article.comments.count from my views, so I need to
add a dependency 'dm-aggregates' in init.rb.
Auto Times - I like the auto-updating created_at and updated_at in
AR. To get this in DataMapper, we just need to require "dm-timestamps".
dependency 'dm-timestamps' in init.rb is one way to do this.
Timestamp Booleans - One of my favorite little tricks is timestamp
columns that can operate as booleans. I have two in Article: #published_at
and #comments_allowed_at. I'll want the following methods: #published?
and #published=(Boolean) (and similar for #comments_allowed_at). Since I might
add similar columns later, I'll do some meta-programming here.
class Article
%w{published comments_allowed}.each do | col |
define_method("#{col}=") do |value|
value = false if (value == '0' || value == 0) # for checkboxes
# update only if the boolean value changed.
if (!value == __send__("#{col}?"))
attribute_set("#{col}_at", value ? Time.now : nil)
end
end
define_method("#{col}?") do
__send__("#{col}_at") ? true : false
end
end
end
#attribute_set is preferred to @attribute_name=(value) for "tracking
dirtiness".
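The same trick works outside DataMapper, which makes it easy to poke at in irb. A minimal sketch, assuming plain attr_accessors in place of DM properties (so no dirty tracking); ArticleSketch is a made-up class.

```ruby
class ArticleSketch
  attr_accessor :published_at, :comments_allowed_at

  %w[published comments_allowed].each do |col|
    # published? / comments_allowed?
    define_method("#{col}?") do
      !public_send("#{col}_at").nil?
    end

    # published= / comments_allowed=
    define_method("#{col}=") do |value|
      value = false if value == "0" || value == 0 # unchecked checkbox
      wanted = value ? true : false
      # Touch the timestamp only when the boolean actually flips.
      if wanted != public_send("#{col}?")
        public_send("#{col}_at=", wanted ? Time.now : nil)
      end
    end
  end
end

article = ArticleSketch.new
article.published = "1"
first_stamp = article.published_at
article.published = true # already published, so the timestamp is kept
p article.published?
p article.published_at.equal?(first_stamp)
article.published = "0"
p article.published?
```

Keeping the original timestamp on a repeated "true" is the whole point: re-saving a form doesn't change when the article was published.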
Auto Migrate again - rake dm:db:automigrate. This will take
care of updating the database with those :nullable => false kind of property
options. I think this is destructive. rake dm:db:autoupgrade,
according to rake -T, is nondestructive. But I don't have any useful data
yet anyway.
Finally, in this "clean up the scaffolding" section, I want to look at the VC
side of MVC. There's a few things needed to match up the controllers with the
associations specified in the model. I'll also work on the views, although I
won't document any of that here. Merb, by the way, supports ERB and
HAML. I assume it supports other templating
engines; looking at the merb-haml gem, anyway, this doesn't look difficult. I'm
going to use HAML for now, because, hey, why not add on something else new.
But, on to the controller/routing changes. (Oh, and I'll ignore the edit and new views
for articles; they will after all disappear shortly).
HAML - Add a 'merb-haml' dependency in init.rb
Router - Basically, I just want to use REST routes (for now), with
comment routes nested in article routes. Also, add the default route.
Merb::Router.prepare do |r|
r.resources :articles do | article |
article.resources :comments
end
r.match('/').to(:controller => 'articles', :action =>'index')
end
Comments controller - I want to update the Comments controller to scope
requests by the article. The key here is a before filter. In this, I'll also
assign a parent Comment, if appropriate.
before :assign_article_and_parent
protected
def assign_article_and_parent
@article = Article.get(params[:article_id])
raise NotFound if @article.nil?
@parent = Comment.get(params[:parent_id]) unless params[:parent_id].blank?
end
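The filter's intent is: bail out with NotFound when the article doesn't exist, and only look up a parent when a parent_id was actually sent. Here's that logic on its own, with hashes standing in for the models. Everything in it (the constants, the NotFound class defined here, the return value) is made up for the sketch; in the app, NotFound comes from Merb.

```ruby
# Local stand-in for Merb's NotFound exception.
class NotFound < StandardError; end

# Hashes standing in for the Article and Comment tables.
ARTICLES = { 1 => "Hello world" }.freeze
COMMENTS = { 7 => "A parent comment" }.freeze

# Returns [article, parent]; raises NotFound when the article is missing.
def assign_article_and_parent(params)
  article = ARTICLES[params[:article_id].to_i]
  raise NotFound, "no article #{params[:article_id]}" if article.nil?

  parent_id = params[:parent_id].to_s
  parent = parent_id.empty? ? nil : COMMENTS[parent_id.to_i]
  [article, parent]
end

with_parent = { article_id: "1", parent_id: "7" }
without_parent = { article_id: "1" }
missing = { article_id: "99" }

p assign_article_and_parent(with_parent)
p assign_article_and_parent(without_parent)
begin
  assign_article_and_parent(missing)
rescue NotFound => e
  puts "NotFound: #{e.message}"
end
```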
There's also some updates such as:
@comment = @article.comments.first(:id => params[:id])
and
@comment = Comment.new
@article.comments << @comment
@parent.replies << @comment if @parent
URLs also need to reflect the nested routing of comments. For example, the
redirect in #create becomes:
redirect url(:article_comment,
:article_id => @article.id,
:comment_id => @comment.id)
I also remove the edit, update and destroy actions. The only mechanism I
will provide for these for now is the console. This is just to avoid needing
an administrative area (even then, though, I'd probably just provide the
destroy option).
Articles controller - Finally, I want to limit Articles to those already
published. Again, a before filter would work, but I'm just going to create
an Article.published method, referenced in index. I could restrict the show
action also to only those published, but I'll leave it for previewing, at
least for now.
class Article
class << self
def published(options = {})
Article.all(options.merge(
:conditions => ["datetime(published_at) <= datetime('now')"],
:order => [:published_at.desc]))
end
end
end
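Note that the datetime(...) condition above is SQLite-flavored SQL. What it's meant to select can be sketched in plain Ruby: anything with a published_at that isn't in the future, newest first. ArticleRow and this standalone published method are hypothetical names for the illustration.

```ruby
ArticleRow = Struct.new(:title, :published_at)

# Articles whose published_at is set and not in the future, newest first.
def published(articles, now = Time.now)
  articles
    .select { |a| a.published_at && a.published_at <= now }
    .sort_by(&:published_at)
    .reverse
end

now = Time.now
articles = [
  ArticleRow.new("Old post", now - 86_400),
  ArticleRow.new("Draft", nil),
  ArticleRow.new("Scheduled", now + 3_600),
  ArticleRow.new("Fresh post", now - 60)
]
p published(articles, now).map(&:title)
```

A nice side effect of comparing against "now" is that setting published_at in the future works as a poor man's scheduled post.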
Revision 33