Hosted Email Servers

January 18, 2010

An Evolutionary Genetic Algorithm for Optimization of Distributed Database Queries

High-performance low-cost PC hardware and high-speed LAN/WAN technologies make distributed database (DDB) systems an attractive research area where query optimization and DDB design are the two important and related problems. Since dynamic programming is not feasible for optimizing queries in a DDB, we propose a new genetic algorithm (GA)-based query optimizer (new genetic algorithm (NGA)) and compare its performance with random and optimal (exhaustive) algorithms. We perform experiments on a synthetic database with replicated relations, but no horizontal or vertical fragmentation. Network links are assumed to be gigabit ethernet. Comparisons with optimal results show that our NGA formulation performs only 20% of the optimal results and we have achieved 50% improvement over a previous GA-based algorithm.

Post to Twitter Tweet This Post

Streaming Covariance Selection with Applications to Adaptive Querying in Sensor Networks

Sensor networks can be naturally represented as graphical models, where the edge set encodes the presence of sparsity in the correlation structure between sensors. Such graphical representations can be valuable for information mining purposes as well as for optimizing bandwidth and battery usage with minimal loss of estimation accuracy. We use a computationally efficient technique for estimating sparse graphical models which fits a sparse linear regression locally at each node of the graph via the Lasso estimator. Using a recently suggested online, temporally adaptive implementation of the Lasso, we propose an algorithm for streaming graphical model selection over sensor networks. With battery consumption minimization applications in mind, we use this algorithm as the basis of an adaptive querying scheme. We discuss implementation issues in the context of environmental monitoring using sensor networks, where the objective is short-term forecasting of local wind direction. The algorithm is tested against real UK weather data and conclusions are drawn about certain tradeoffs inherent in decentralized sensor networks data analysis.

Post to Twitter Tweet This Post

Internet Failures: an Emergent Sea of Complex Systems and Critical Design Errors?

Complex systems researchers have looked to the Internet as a possible source of interesting emergent behaviour. Indeed, some high-profile failures, and some low-level phenomena, might easily be construed as evidence of a complex system. In this paper, I look at the local and global consequences of the Internet design, and show that few, if any, of these problems are actually consequences of emergent properties in the pure technical sense. However, there are lessons for network architecture from these problems. The influence of local decisions on global behaviour of the network is a source of some of the difficulties that protocol designers must cope with, but it is also a source of great wealth and innovation, and as such should be regarded in a positive light.

Post to Twitter Tweet This Post

Prospects for Bandit Solutions in Sensor Management

Sensor management in information-rich and dynamic environments can be posed as a sequential action selection problem with side information. To study such problems we employ the dynamic multi-armed bandit with covariates framework. In this generalization of the multi-armed bandit, the expected rewards are time-varying linear functions of the covariate vector. The learning goal is to associate the covariate with the optimal action at each instance, essentially learning to partition the covariate space adaptively. Applications of sensor management are frequently in environments in which the precise nature of the dynamics is unknown. In such settings, the sensor manager tracks the evolving environment by observing only the covariates and the consequences of the selected actions. This creates difficulties not encountered in static problems, and changes the exploitation–exploration dilemma. We study the relationship between the different factors of the problem and provide interesting insights. The impact of the environment dynamics on the action selection problem is influenced by the covariate dimensionality. We present the surprising result that strategies that perform very little or no exploration perform surprisingly well in dynamic environments.

Post to Twitter Tweet This Post

Generic Text Summarization for Turkish

Filed under: Computer news — Tags: , — admin @ 10:55 am

In this paper, we propose a generic text summarization method that generates summaries of Turkish texts by ranking sentences according to their scores. Sentence scores are calculated using their surface-level features, and summaries are created by extracting the highest ranked sentences from the original documents. To extract sentences which form a summary with an extensive coverage of the main content of the text and less redundancy, we use features such as term frequency, key phrase (KP), centrality, title similarity and sentence position. The sentence rank is computed using a score function that uses its feature values and the weights of the features. The best feature weights are learned using machine-learning techniques with the help of human-constructed summaries. Performance evaluation is conducted by comparing summarization outputs with manual summaries of two newly created Turkish data sets. This paper presents one of the first Turkish summarization systems, and its results are promising. We introduce the usage of KP as a surface-level feature in text summarization, and we show the effectiveness of the centrality feature in text summarization. The effectiveness of the features in Turkish text summarization is also analyzed in detail.

Post to Twitter Tweet This Post

Networks of Concepts and Ideas

Filed under: Computer news — Tags: , — admin @ 10:55 am

We present the results of an experiment designed to investigate the way information is organized and stored in the human brain. In particular, we are using controlled stimuli to reverse engineer the networks of ideas and concepts in order to answer the following questions. (1) Are the networks of ideas and concepts in the human brain invoked by verbal and visual stimuli distinct from each other? The answer appears to be no for the network of ideas and inconclusive for the network of concepts. (2) What is the topology of these networks? Our experimental results show that both are small-world networks, with the network of ideas being random and the network of concepts scale-free.

Post to Twitter Tweet This Post

Implementing a Thermal-Aware Scheduler in Linux Kernel on a Multi-Core Processor

As power dissipation causes thermal issues in cooling costs, lifetime and reliability, thermal management has become an important issue in today’s OS and processor design. Early OS-level thermal management schemes were proposed and evaluated mainly with simulators or analytical models. In this paper, we implement a thermal-aware round-robin scheduling algorithm in the Linux kernel, and compare its performance with the ‘Heat-and-Run’ algorithm and the default Linux baseline scheduler on an Intel Core 2 Duo processor using representative benchmarks from SPEC2000, MiBench and NetBench. Our results indicate that the current Linux scheduler can easily be enhanced with thermal-awareness to show improved performance in terms of both the on-chip temperature condition and application throughput.

Post to Twitter Tweet This Post

NVR-BIP: Nuclear Vector Replacement using Binary Integer Programming for NMR Structure-Based Assignments

Nuclear magnetic resonance (NMR) spectroscopy is an important experimental technique that allows one to study protein structure and dynamics in solution. An important bottleneck in NMR protein structure determination is the assignment of NMR peaks to the corresponding nuclei. Structure-based assignment (SBA) aims to solve this problem with the help of a template protein which is homologous to the target and has applications in the study of structure–activity relationship, protein–protein and protein–ligand interactions. We formulate SBA as a linear assignment problem with additional nuclear overhauser effect constraints, which can be solved within nuclear vector replacement’s (NVR) framework (Langmead, C., Yan, A., Lilien, R., Wang, L. and Donald, B. (2003) A Polynomial-Time Nuclear Vector Replacement Algorithm for Automated NMR Resonance Assignments. Proc. the 7th Annual Int. Conf. Research in Computational Molecular Biology (RECOMB), Berlin, Germany, April 10–13, pp. 176–187. ACM Press, New York, NY. J. Comp. Bio., (2004), 11, pp. 277–298; Langmead, C. and Donald, B. (2004) An expectation/maximization nuclear vector replacement algorithm for automated NMR resonance assignments. J. Biomol. NMR, 29, 111–138). Our approach uses NVR’s scoring function and data types and also gives the option of using CH and NH residual dipolar coupling (RDCs), instead of NH RDCs which NVR requires. We test our technique on NVR’s data set as well as on four new proteins. Our results are comparable to NVR’s assignment accuracy on NVR’s test set, but higher on novel proteins. Our approach allows partial assignments. It is also complete and can return the optimum as well as near-optimum assignments. Furthermore, it allows us to analyze the information content of each data type and is easily extendable to accept new forms of input data, such as additional RDCs.

Post to Twitter Tweet This Post

A Necessary and Sufficient Condition for the Liveness of Normal Nets

This paper gives a necessary and sufficient condition for the liveness of normal nets, i.e. a normal net with a given initial marking is live if and only if it is structurally repetitive and each minimal siphon is marked in any reachable marking. Furthermore, it is proved that a normal net is structurally live if and only if it is structurally repetitive. Finally, we prove that a weakly persistent net, which is a special normal net, is live for a given initial marking if and only if it is structurally repetitive and each minimal siphon is marked in the initial marking. That is to say, the liveness of weakly persistent nets can be decided by the net structure and the initial marking only.

Post to Twitter Tweet This Post

An Analysis of Associated Dividends in the DBM Algorithm for Division by Constants Using Multiplication

When a compiler encounters a fixed integer divisor, which is not a power of 2, it can calculate its inverse to be multiplied by the run-time integer dividends to obtain the quotients, using our very efficient, recently published [Cavagnino, D. and Werbrouck, A.E. (2008) Efficient algorithms for integer division by constants using multiplication. Comp. J., 51, 470–480] division by multiplication algorithms. Essentially our algorithms permit a complete partition of a defined number space into non-adverse and adverse divisors on the basis of whether a dividend associated with each divisor is, or is not, greater than the maximum dividend size. In this paper, we demonstrate useful relations between the dividends associated with all divisors and with their multiples by positive powers of 2 leading to rapid iterative algorithms for calculating full sets of associated dividends.

Post to Twitter Tweet This Post

Older Posts »

Powered by WordPress