This post is an after-completion summary of my GSoC project, GreenSMW.
What was the idea of this project?
The original proposal can be found at http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/nischayn22/1
The main deliverables proposed there were:
- Validation of writes using a hash
- Caching of Special Pages
- Identification and caching of frequently made queries in Special:ExportRDF
- Improvements to SMW’s accesses to the database.
- Identification and caching of large inline queries or complex templates using memcache
- Profiling and documentation
What part of this has been achieved, what was left behind?
- Validation of writes using a hash — Done (this was very easy and was completed very early)
- Improvements to SMW’s accesses to the database. — Done (this was the most complicated task as it involved lots of refactoring of old code)
- Caching of Special Pages — An alternative strategy is being applied here: the Special page methods have been made efficient enough that they no longer need caching as such (this change is yet to be committed).
- Identification and caching of frequently made queries in Special:ExportRDF — This was later identified as very low priority, since many other places to improve were found.
- Identification and caching of large inline queries or complex templates using memcache — This task turned out to be non-trivial. Memcache uses time-based caching, which is not a good fit for queries because they involve a lot of invalidation. We planned to work on a different technique that invalidates queries by storing their metadata; this is a bigger task and we decided to do it post GSoC. Until then, users can use a memcache-based approach, as MWJames has been doing: http://wikimedia.7.n6.nabble.com/Re-Query-result-caching-and-invalidation-Jeroen-De-Dauw-td4981469.html#none
- Profiling and documentation — Mostly done, with more to be done when SMW 1.8 is released.
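To give a feel for the first item, here is a minimal sketch of validating writes with a hash. It is written in Python for brevity and is not SMW's actual code. It assumes the idea is to hash a page's semantic data and skip the database write when the hash has not changed. The `Store` class, `semantic_hash` and the property names are all made up for illustration.

```python
import hashlib
import json


def semantic_hash(data):
    """Return a stable digest for a page's property -> values mapping."""
    # Sort keys and values so logically equal data always hashes the same.
    canonical = json.dumps(
        {prop: sorted(values) for prop, values in data.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Store:
    """Hypothetical in-memory store; stands in for the real database."""

    def __init__(self):
        self.rows = {}
        self.hashes = {}
        self.writes = 0

    def save(self, page, data):
        """Write the data only if it differs from what was stored last."""
        new_hash = semantic_hash(data)
        if self.hashes.get(page) == new_hash:
            return False  # nothing changed, skip the write
        self.rows[page] = data
        self.hashes[page] = new_hash
        self.writes += 1
        return True


store = Store()
store.save("Berlin", {"Population": ["3500000"], "Country": ["Germany"]})
# Same data in a different order: no second write happens.
store.save("Berlin", {"Country": ["Germany"], "Population": ["3500000"]})
```

The point of the sketch is only that comparing one short digest is much cheaper than rewriting every property row on each page save.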
What was not in the plan (we don’t have plans for everything, do we?)
- Unit Tests — We covered some parts of SMW’s code using PHPUnit tests.
- Fixed Properties — A side product of reorganizing the DB code: wiki admins can assign separate tables to heavily used properties, so queries on those properties take little time.
- Migration Script — A script to let users actually switch to SMW 1.8 without disrupting their site’s activity.
- Semantic diff and site stats — Not fully mature yet, but SMW will now be able to produce a diff of the semantic data and also store stats on property usage.
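The semantic diff and property stats can be pictured with a small sketch. Again this is Python rather than SMW's code, and it assumes a page's semantic data is a simple mapping from property names to lists of values. The function names `semantic_diff` and `update_usage` are made up for this example.

```python
def semantic_diff(old, new):
    """Return (added, removed) lists of (property, value) pairs."""
    old_pairs = {(prop, v) for prop, values in old.items() for v in values}
    new_pairs = {(prop, v) for prop, values in new.items() for v in values}
    return sorted(new_pairs - old_pairs), sorted(old_pairs - new_pairs)


def update_usage(stats, added, removed):
    """Adjust per-property usage counts from a diff, in place."""
    for prop, _ in added:
        stats[prop] = stats.get(prop, 0) + 1
    for prop, _ in removed:
        stats[prop] = stats.get(prop, 0) - 1
    return stats


before = {"Population": ["3500000"], "Country": ["Germany"]}
after = {"Population": ["3600000"], "Country": ["Germany"]}
added, removed = semantic_diff(before, after)
usage = update_usage({"Population": 1, "Country": 1}, added, removed)
```

With a diff like this, only the changed pairs need to be written, and the usage counts can be kept up to date without recounting every property.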
What do you consider the best aspect of participating in GSoC?
The best aspect of participating was contributing to a project that hundreds of people use. Besides, this opportunity gave me immense exposure to the process of software development in open source.
What do you consider the most challenging part of your summer?
Working with existing code was a challenge. Changing something in one place would break something elsewhere, and this happened many times.
How were your mentors?
Awesome. Having two mentors was really beneficial.
Which tips would you give to future students?
Talk to students from previous years, and talk to mentors as early as possible. Don’t be intimidated by big codebases :)
What one thing did the Wikimedia community do that you consider very helpful for your project and would suggest they continue to do?
Developers at Wikimedia have been very helpful throughout; they maintain a friendly atmosphere that welcomes more contributors. I am also thankful to Wikimedia Deutschland for funding my travel to SMWCon in Germany.