LOD cloud shows surprisingly lumpy structure

March 10th, 2010

The prototypical Linked Open Data map gives the general impression of a richly interlinked set of bubbles. However, a small experiment showed that this first impression is very wrong! Christophe Gueret from the VU Amsterdam reconstructed the LOD link table as a .net file, which we then displayed using simple stress minimisation in Visione. This revealed some surprises:

Surprise 1: There is not one cloud, but three. As the graph visualisation shows, LOD consists of three sub-clouds, each with dense internal connections and only sparse connections between them. The three sub-clouds are also clearly recognisable: one is bio/life-sciences data, one is (surprisingly) academic bibliographic material, and the central one is “all the rest”, connecting the other two, with DBPedia as its hub.

Surprise 2: DBLP is as important as DBPedia. Also surprisingly, the total betweenness centrality of the relatively unknown DBLP datasets is almost as high as that of the widely recognised DBPedia hub. The three DBLP instances together account for 25% of the betweenness, almost the same as DBPedia (28%). The reason for this high betweenness is that the DBLP sets are the only link between the bibliographic sub-cloud and “the rest”.
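To make the betweenness argument concrete, here is a minimal sketch of Brandes' algorithm on a toy graph. The graph is invented for illustration (three small cliques joined by single bridging edges) and is not the LOD link table; all node names are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs.

    adj maps each node to the set of its neighbours. Returns raw
    (unnormalised) betweenness, counting each unordered pair once.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inwards.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

def build_graph(edges):
    """Turn an edge list into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def clique(nodes):
    """All pairwise edges among the given nodes."""
    return [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]

# Hypothetical toy data: three dense clusters with one bridge each.
bio = ["bio1", "bio2", "bio3"]
hub = ["hub", "x1", "x2"]
bib = ["bridge", "bib1", "bib2"]
edges = clique(bio) + clique(hub) + clique(bib)
edges += [("bio1", "hub"), ("hub", "bridge")]

scores = betweenness(build_graph(edges))
total = sum(scores.values())
shares = {v: round(100 * b / total) for v, b in scores.items() if b > 0}
```

In this toy graph the `bridge` node has only three links, yet it takes a large share of the total betweenness simply because every shortest path into its cluster passes through it. That is the same effect described for the DBLP datasets.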

So now the questions are: Is this good or bad? Is this surprising or obvious? Is this long-term structural or just a short-term coincidence? Anybody? (A first experiment would be to take the density of the links between the bubbles into account and see whether this changes anything. The .net file is here for you to experiment with; please share your results.)

LarKC partner Saltlux receives national “Grand Software Award”

March 3rd, 2010

Only a day after LarKC partner Ontotext received the Pitagoras award, LarKC partner Saltlux received the “Grand Software Award” from the Korean Ministry of Knowledge Economy! The prize was awarded for their semantic search engine [IN2]Discovery.

Saltlux plays a key role in the Urban Computing case-study of LarKC, and is planning a Korean version of the current Milan-based LarKC workflow.

LarKC partner Ontotext receives Pitagoras Award

February 27th, 2010

LarKC partner Ontotext, part of Sirma Group, has received the prestigious scientific Pitagoras award for “a company which has mastered new scientific research in the most successful way or provided specialized services for public benefit.” The awards are presented by the Ministry of Education, Youth and Science to honor the scientists and organisations in Bulgaria that have contributed most, in 11 categories. The annual prizes date back to 2003.

“This is a great acknowledgement for all of us in Ontotext, for our efforts to prove that a small Bulgarian software company could participate in scientific research on a global scale, to develop software products and to successfully compete the leading companies worldwide”, the company manager Atanas Kiriakov stated.

The company is known for several products: KIM, the most popular platform for semantic annotation and search, and the semantic database OWLIM, proven to be the fastest and most scalable RDF(S)/OWL system. Over the past nine years Ontotext has established itself as an important participant in several significant open source projects such as GATE and Sesame.

Ontotext is a core partner of the LarKC project, providing scalable solutions for data storage and querying, and playing a key role in the pharmaceutical case-study.

http://www.24-7pressrelease.com/press-release/ontotext-software-company-received-the-pitagoras-award-138949.php

LarKC’s WebPIE Inference Engine makes it to the final rounds of the SCALE 2010 Challenge

February 24th, 2010

A team from the VU University Amsterdam, consisting of LarKC members and members of Prof. Henri Bal’s Distributed Computing Group, has succeeded in getting through to the final rounds of the SCALE 2010 Challenge, with their entry entitled “WebPIE: a Web-scale Parallel Inference Engine”.

SCALE is the third IEEE International Scalable Computing Challenge, sponsored by the IEEE Computer Society Technical Committee on Scalable Computing. The objective of the SCALE Challenge is to highlight and showcase real-world problem solving using computing that scales. Participants in the challenge are expected to identify significant current real-world problems where scalable computing techniques can be effectively used, and design, implement, evaluate and demonstrate solutions. The SCALE Challenge is co-located with CCGRID, the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

The VUA team submitted the work led by Jacopo Urbani and Spyros Kotoulas on WebPIE, an inference engine that can perform at Web scale. This MapReduce-based inference engine performs OWL Horst inference on datasets of up to 100 billion triples, as earlier reported here and here.
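As a rough illustration of how a single inference rule can be phrased as a map step and a reduce step, the sketch below applies the RDFS subclass rule (if `x rdf:type C` and `C rdfs:subClassOf D`, then `x rdf:type D`) to a handful of triples on one machine. It is a toy and not WebPIE's implementation: the function names, the in-memory shuffle and the sample triples are all assumptions made for this example, and OWL Horst contains many more rules than this one.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def map_phase(triple):
    """Emit (join key, tagged value) pairs; the join key is the class C."""
    s, p, o = triple
    if p == TYPE:
        yield o, ("instance", s)       # x rdf:type C        -> key C
    elif p == SUBCLASS:
        yield s, ("superclass", o)     # C rdfs:subClassOf D -> key C

def reduce_phase(key, values):
    """Join instances and superclasses that share the same class key."""
    instances = [v for tag, v in values if tag == "instance"]
    supers = [v for tag, v in values if tag == "superclass"]
    for x in instances:
        for d in supers:
            yield (x, TYPE, d)

def run_once(triples):
    """One map/shuffle/reduce pass; returns only newly derived triples."""
    groups = defaultdict(list)
    for t in triples:
        for key, value in map_phase(t):
            groups[key].append(value)
    derived = set()
    for key, values in groups.items():
        derived.update(reduce_phase(key, values))
    return derived - set(triples)

def closure(triples):
    """Repeat passes until no new triples appear (a fixpoint)."""
    known = set(triples)
    while True:
        new = run_once(known)
        if not new:
            return known
        known |= new

# Hypothetical sample data.
triples = {
    ("ex:webpie", TYPE, "ex:InferenceEngine"),
    ("ex:InferenceEngine", SUBCLASS, "ex:Reasoner"),
    ("ex:Reasoner", SUBCLASS, "ex:Software"),
}
inferred = closure(triples) - triples
```

The repeated passes in `closure` hint at why scale is hard: each derived triple may enable further derivations, so the job has to be rerun until nothing new is produced.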

This selection for SCALE shows that the results achieved by WebPIE are not only interesting from a Semantic Web perspective, but are also attracting interest from the distributed computing community. The final rounds of the SCALE competition will take place 17-20 May in Melbourne.

ARTIFICIAL INTELLIGENCE AND SYMBOLIC COMPUTATION

February 24th, 2010

This conference is a perfect venue to show off uses of the LarKC platform, or to report on new techniques that might serve as plug-ins. And its deadline has been extended.                    

   CALL FOR PAPERS
            AISC 2010 - 10th International Conference on
          ARTIFICIAL INTELLIGENCE AND SYMBOLIC COMPUTATION
              Theory, Implementations and Applications
           http://cicm2010.cnam.fr/aisc/
           CNAM, Paris, France, July 5th - July 6th, 2010
**********************************************************************
DEADLINE EXTENSION: - Abstracts:      March 9, 2010
                   - Full papers:   March 12, 2010

Easier plug-ins with Java 7

February 15th, 2010

(by Michael Witbrock, Cycorp Europe)

The LarKC architecture separates the platform, which provides services to support large-scale inference, from the plug-ins, which implement novel methods of inference, data identification, data selection, transformation and workflow control. But maintaining this clean separation has been a little awkward, since one has to do a little bit of work to register new plug-ins with the platform or de-register them. This process looks set to get easier with Java 7, whose new watch service will make it easy to detect file system events, independent of OS. Registering a plug-in should be as easy as dropping its JAR in a directory; de-registering as easy as moving the JAR elsewhere.
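The drop-a-JAR idea can be sketched independently of the Java API. The Python snippet below does not use the Java 7 watch service; it substitutes a simple polling approach that compares two directory snapshots and turns the difference into register and de-register events. The function names and the JAR file name are hypothetical and not part of the LarKC platform.

```python
import os
import tempfile

def snapshot(plugin_dir):
    """Return the set of JAR file names currently in plugin_dir."""
    return {name for name in os.listdir(plugin_dir) if name.endswith(".jar")}

def diff(before, after):
    """Translate two snapshots into (registered, deregistered) name sets."""
    return after - before, before - after

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as plugin_dir:
    seen = snapshot(plugin_dir)

    # Dropping a JAR into the directory registers the plug-in.
    jar = os.path.join(plugin_dir, "selecter.jar")
    open(jar, "w").close()
    current = snapshot(plugin_dir)
    registered, deregistered = diff(seen, current)
    seen = current

    # Removing the JAR from the directory de-registers it again.
    os.remove(jar)
    current = snapshot(plugin_dir)
    registered_2, deregistered_2 = diff(seen, current)
```

A real watch service would deliver these events as they happen instead of requiring the platform to rescan the directory, which is the convenience the post is looking forward to.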

Expert Meeting on Semantic Data Management Systems

February 10th, 2010

SemData is a new initiative to get together the major players in Semantic Data Management systems, both academic research groups and commercial vendors, large and small. The goal is to have expert discussions on issues such as semantic repositories, their virtualization and distribution, scalability, interoperability with relational stores, benchmarking, etc. SemData will organise a series of meetings. The first one will happen March 11-12 in Sofia. Major academic and industrial players have already committed to attend. Meetings later in the year will be co-located with ESWC in Heraklion in June, and VLDB in Singapore in September.

Linked Life Data 0.4.1 released

February 4th, 2010

 

LarKC’s Linked Life Data development team is proud to announce the 0.4.1 release of the LLD service. LLD is a public RDF warehouse that semantically integrates more than 20 popular biomedical data sources. The release contains 4,179,999,703 statements that connect 579,309,731 RDF resources. Some of the new features are:

  • Major UI face-lift
  • UMLS concept auto-complete search
  • New default search interface that combines UMLS concept auto-complete, RDF explore, and full-text search
  • Extension of SPARQL with Lucene full-text search index, example: select * where {?s <http://www.ontotext.com/luceneQuery> "<lucene query>"}
  • Simplified semantic annotation scheme so that only one predicate is used to link document and concept (http://linkedlifedata.com/resource/lifeskim/mentions)
  • Revision of the UMLS semantic types that are used
  • Altered schema of PubMed document
  • Mappings between EntrezGene and all other sources that mention genes
  • Mappings between Taxonomy and all other sources that mention organisms
  • Mappings between GO and all other sources that mention gene or gene product annotation terms
  • Many fixes in the data source processing (encoding, incorrect URI generation, non-conventional URIs)

The service is accessible at http://linkedlifedata.com and http://linkedlifedata.com/openrdf-sesame/repositories/owlim?query=
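For readers who want to try the Lucene extension from a script, the following sketch builds a request URL for the repository address above. It only constructs the URL and sends nothing. The search term `aspirin` and the helper name `build_query_url` are made up for illustration. The sketch also assumes that the predicate IRI is `http://www.ontotext.com/luceneQuery` and that the repository accepts the query as a URL-encoded `query` parameter, as the trailing `?query=` in the address suggests.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://linkedlifedata.com/openrdf-sesame/repositories/owlim"
LUCENE_PREDICATE = "http://www.ontotext.com/luceneQuery"  # assumed complete IRI

def build_query_url(lucene_query):
    """Return a GET URL carrying a SPARQL query with a full-text pattern."""
    # Escape backslashes and double quotes so the string literal stays valid.
    literal = lucene_query.replace("\\", "\\\\").replace('"', '\\"')
    sparql = 'select * where {?s <%s> "%s"}' % (LUCENE_PREDICATE, literal)
    return ENDPOINT + "?" + urlencode({"query": sparql})

url = build_query_url("aspirin")  # "aspirin" is a made-up search term
```

Encoding the query with `urlencode` matters here because the SPARQL text contains spaces, braces, angle brackets and quotes, none of which are safe in a raw URL.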

3rd LarKC Early Adopters Tutorial at ESWC 2010

January 27th, 2010

(by Mick Kerrigan, STI Innsbruck)

The 3rd edition of the LarKC Early Adopters Tutorial has been accepted at the 7th Extended Semantic Web Conference (ESWC10). As always, the tutorial will be a full-day event that provides participants with early access to the results of the LarKC project. Having completed the tutorial, participants will be able to build plug-ins for the LarKC platform and create new workflows by combining existing LarKC plug-ins, and will have a thorough overview of the LarKC approach. The tutorial will take place on the 30th of May 2010 on the island of Crete in Greece, and registration is possible through the ESWC10 website. We have learned a lot about how to present the LarKC content at the previous tutorials at ESWC09 and ISWC09, so if you have not been able to attend a tutorial so far, we hope you will be able to join us in Crete! More information on the agenda for the tutorial can be found in the early adopters section of the LarKC website.

Computer Scientists, Using your Professional Interests at Web-scale

January 27th, 2010

(by Yi Zeng, LarKC WICI team)

In their daily research and engineering work, (computer) scientists try to keep up with new knowledge and technology trends from multiple data sources such as DBLP, the ACM and IEEE digital libraries, CiteSeerX, Google Scholar, Amazon and many more.

From one perspective, personalized search is expected in each of these systems, which hold large-scale scientific data (namely, scalability is supposed to be addressed through the diversity of user needs); yet even if a system provides this functionality, you cannot carry your interest profile from one system to another. From another perspective, studies of user profiles often ignore the time factor, which may have negative effects.

In the LarKC project, we use models inspired by cognitive memory retention (which consider both the frequency and the recency of interests) to derive computer scientists’ (your) retained interests from their (your) publications in DBLP, and we represent them in RDF (with FOAF vocabularies; more explanations can be found here). We are happy to share this professional-interest data with the Semantic Web community worldwide.
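To show how frequency and recency can be combined into one score, here is a minimal sketch. It assumes a power-law decay in which each publication on a topic contributes `age ** -decay`, where `age` is the number of years since publication (counting the current year as 1). The decay exponent, the topic names and the publication counts are hypothetical; this is not necessarily the model or the parameters used in LarKC.

```python
def retained_interest(pubs_per_year, current_year, decay=1.0):
    """Sum decayed contributions of publications on one topic.

    pubs_per_year maps a year to the number of publications on the topic.
    More publications (frequency) and later years (recency) both raise
    the score.
    """
    score = 0.0
    for year, count in pubs_per_year.items():
        age = current_year - year + 1   # publications this year have age 1
        if age >= 1:                    # ignore future-dated entries
            score += count * age ** -decay
    return score

# Hypothetical author profile: topic -> {year: publication count}
profile = {
    "semantic web": {2008: 2, 2009: 3},
    "neural networks": {2000: 4, 2001: 4},
}
scores = {
    topic: retained_interest(years, current_year=2010)
    for topic, years in profile.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this made-up profile the author has more publications on neural networks overall, but the recent work on the semantic web ranks first because older publications have decayed. A pure publication count would miss that.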

Now, if you want to create cross-data-source applications and use the professional-interest information everywhere, please do so: the Computer Scientists Retained Interests RDF data is truly downloadable! The data will be updated frequently, and retained interests are now available for the 615,124 computer scientists in DBLP, for you and for your potential service providers. Next time you buy books on Amazon, let your DBLP interest profile help you refine your query, so that books more relevant to your background are offered to you. And if you have been away from a field of study for several years without noticing, let your scientific literature systems notify you that interesting things are going on there.

In LarKC, we started this effort last year, but we are not going to stop here. Much more refined versions will be provided this year, and before long we are going to serve the medical scientists too!