All news and information will be posted on Twitter; this blog has moved to Twitter.
Archive for the ‘hbase key-value store’ Category
all news and information will be posted on twitter
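Thursday, February 25th, 2010
Two ways for a Java program to read and write a remote HBase database [repost]
Friday, February 5th, 2010
Keywords: hbase, remote connection
There are two approaches. The first is to add hbase-default.xml and hbase-site.xml to your project's classpath.
The second is to specify the connection settings in code, for example: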
HBaseConfiguration conf = new HBaseConfiguration();
conf.set("hbase.master","192.168.2.38:60000");
HTable table = new HTable(conf, "blogposts");
GQL Reference
Wednesday, October 14th, 2009
GQL is a SQL-like language for retrieving entities or keys from the App Engine scalable datastore. While GQL’s features are different from those of a query language for a traditional relational database, the GQL syntax is similar to that of SQL.
The GQL syntax can be summarized as follows:
SELECT [* | __key__] FROM <kind>
[WHERE <condition> [AND <condition> ...]]
[ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
[LIMIT [<offset>,]<count>]
[OFFSET <offset>]
<condition> := <property> {< | <= | > | >= | = | != } <value>
<condition> := <property> IN <list>
<condition> := ANCESTOR IS <entity or key>
As with SQL, GQL keywords are case insensitive. Kind and property names are case sensitive.
A GQL query returns zero or more entities or Keys of the requested kind. Every GQL query begins with either SELECT * or SELECT __key__, followed by FROM and the name of the kind. (A GQL query cannot perform a SQL-like “join” query.)
Tip: SELECT __key__ queries are faster and cost less CPU than SELECT * queries.
The optional WHERE clause filters the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are given with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator. However, it does have an IN operator, which provides a limited form of OR.
The IN operator compares the value of a property to each item in a list. The IN operator is equivalent to many = queries, one for each value, that are ORed together. An entity whose value for the given property equals any of the values in the list can be returned for the query.
Note: The IN and != operators use multiple queries behind the scenes. For example, the IN operator executes a separate underlying datastore query for every item in the list. The entities returned are a result of the cross-product of all the underlying datastore queries and are de-duplicated. A maximum of 30 datastore queries are allowed for any single GQL query.
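The fan-out described in this note can be illustrated with a small in-memory sketch. It does not call the App Engine API: `expand_in_filters`, `merge_results`, and `MAX_SUBQUERIES` are hypothetical names, and the code only models how several IN lists multiply into equality-only queries whose results are then de-duplicated.

```python
from itertools import product

MAX_SUBQUERIES = 30  # the limit quoted in the note above


def expand_in_filters(in_filters):
    """Expand IN filters into the equality-only queries they stand for.

    in_filters maps a property name to the list of values given to IN.
    Returns one {property: value} dict per underlying query.
    """
    names = sorted(in_filters)
    combos = product(*(in_filters[name] for name in names))
    queries = [dict(zip(names, combo)) for combo in combos]
    if len(queries) > MAX_SUBQUERIES:
        raise ValueError("%d underlying queries exceeds the limit of %d"
                         % (len(queries), MAX_SUBQUERIES))
    return queries


def merge_results(result_lists):
    """Union the results of the underlying queries, dropping duplicate keys."""
    seen = set()
    merged = []
    for results in result_lists:
        for key in results:
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged
```

Two IN lists of 2 and 3 values expand to 6 underlying queries, so a handful of modest lists can reach the limit of 30 quickly.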
A condition can also test whether an entity has a given entity as an ancestor, using the ANCESTOR IS operator. The value is a model instance or Key for the ancestor entity. For more information on ancestors, see Keys and Entity Groups.
The left-hand side of a comparison is always a property name. The right-hand side can be one of the following (as appropriate for the property’s data type):
- a str literal, as a single-quoted string. Single-quote characters in the string must be escaped as ''. For example: 'Joe''s Diner'
- an integer or floating point number literal. For example: 42.7
- a Boolean literal, as TRUE or FALSE.
- the NULL literal, which represents the null value (None in Python).
- a datetime, date, or time literal, with either numeric values or a string representation, in the following forms:
  - DATETIME(year, month, day, hour, minute, second)
  - DATETIME('YYYY-MM-DD HH:MM:SS')
  - DATE(year, month, day)
  - DATE('YYYY-MM-DD')
  - TIME(hour, minute, second)
  - TIME('HH:MM:SS')
- an entity key literal, with either a string-encoded key or a complete path of kinds and key names/IDs:
  - KEY('encoded key')
  - KEY('kind', 'name'/ID [, 'kind', 'name'/ID...])
- a User object literal, with the user’s email address: USER('email-address')
- a GeoPt literal, with the latitude and longitude as floating point values: GEOPT(lat, long)
- a bound parameter value. In the query string, positional parameters are referenced by number: title = :1
  Keyword parameters are referenced by name: title = :mytitle
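As a small illustration of the literal forms above, the following sketch builds a str literal and a DATETIME literal in plain Python. The helper names and the Restaurant kind are hypothetical; in real code a bound parameter is usually the safer choice than pasting values into the query string.

```python
import datetime


def gql_str_literal(text):
    """Quote text as a GQL str literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def gql_datetime_literal(value):
    """Render a datetime in the DATETIME('YYYY-MM-DD HH:MM:SS') form."""
    return "DATETIME('%s')" % value.strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical kind and property names, for illustration only.
query = "SELECT * FROM Restaurant WHERE name = " + gql_str_literal("Joe's Diner")
```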
Note: conditions of the form property = NULL check whether a null value is explicitly stored in the datastore for that property. This is not the same as checking whether the entity lacks any value for the property! Datastore queries which refer to a property never return entities which don’t have some value for that property.
Bound parameters can be bound as positional arguments or keyword arguments passed to the GqlQuery constructor or a Model class’s gql() method. Property data types that do not have corresponding value literal syntax must be specified using parameter binding, including the list data type. Parameter bindings can be re-bound with new values during the lifetime of the GqlQuery instance (such as to efficiently reuse a query) using the bind() method.
The optional ORDER BY clause indicates that results should be returned sorted by the given properties, in either ascending (ASC) or descending (DESC) order. If the direction is not specified, it defaults to ASC. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right.
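To show what "evaluated from left to right" means for multiple sort orders, here is an in-memory sketch over plain dicts. It is only an analogy for the datastore's behaviour; `order_by` and the sample properties are hypothetical.

```python
def order_by(entities, orderings):
    """Sort dicts by a list of (property, direction) pairs, leftmost first."""
    result = list(entities)
    # Apply the sort keys right to left; Python's sort is stable, so the
    # leftmost ordering ends up as the primary one.
    for prop, direction in reversed(orderings):
        result.sort(key=lambda e: e[prop], reverse=(direction == "DESC"))
    return result


posts = [
    {"author": "bob", "date": 2},
    {"author": "amy", "date": 1},
    {"author": "amy", "date": 3},
]
# Equivalent in spirit to: ORDER BY author ASC, date DESC
ordered = order_by(posts, [("author", "ASC"), ("date", "DESC")])
```

The first property decides the overall order; the second only breaks ties among entities with the same author.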
An optional LIMIT clause causes the query to stop returning results after the first <count> entities. The LIMIT can also include an <offset> to skip that many results to find the first result to return. An optional OFFSET clause can specify an <offset> if no LIMIT clause is present.
Note: A LIMIT clause has a maximum of 1000. If a limit larger than the maximum is specified, the maximum is used. This same maximum applies to the fetch() method of the GqlQuery class.
Note: Like the offset parameter for the fetch() method, an OFFSET in a GQL query string does not reduce the number of entities fetched from the datastore. It only affects which results are returned by the fetch() method. The cost of a query with an offset grows linearly with the offset size.
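The cost model in these notes can be sketched with a toy function. This is not the GqlQuery API: `fetch` here is a hypothetical stand-in that works on a Python list, and it only models the two stated rules (the limit is capped at 1000, and skipped results are still read).

```python
MAX_LIMIT = 1000  # maximum quoted in the note above


def fetch(index_rows, limit, offset=0):
    """Model LIMIT/OFFSET: return (results, number_of_rows_read)."""
    limit = min(limit, MAX_LIMIT)
    scanned = index_rows[:offset + limit]  # skipped rows are still read
    return scanned[offset:], len(scanned)


rows = list(range(5000))
results, rows_read = fetch(rows, 10, offset=200)
```

Here only 10 results come back, but 210 rows were read to produce them, which is why deep offsets get progressively slower.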
For information on executing GQL queries, binding parameters, and accessing results, see the GqlQuery class, and the Model.gql() class method.
Daemon service shell for Hypertable on CentOS 5.3
Friday, October 2nd, 2009
#!/bin/bash
#
# chkconfig: 2345 99 01
#
# description: Starts and stops the Hypertable servers.
# processname: hypertabled
#
### BEGIN INIT INFO
# Provides: hypertabled
# Required-Start: $syslog $local_fs
# Required-Stop: $syslog $local_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: hypertable daemon
# Description: Starts and stops the Hypertable servers.
### END INIT INFO

# source function library
. /etc/rc.d/init.d/functions

RETVAL=0

start() {
    echo -n $"Starting hypertabled: "
    # Run the start script as the hypertable user, with KFS as the DFS broker.
    su - hypertable -c "/opt/hypertable/0.9.2.7/bin/start-all-servers.sh kfs"
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ]
}

stop() {
    echo -n $"Stopping hypertabled: "
    su - hypertable -c "/opt/hypertable/0.9.2.7/bin/stop-servers.sh"
    # Record the exit status so the check below reflects this command.
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ]
}

restart() {
    stop
    start
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart|force-reload|reload)
        restart
        ;;
    condrestart|try-restart)
        restart
        ;;
    status)
        # Report each Hypertable process; exit non-zero if any is not running.
        for proc in ThriftBroker Hypertable.Master localBroker \
                    Hyperspace.Master Hypertable.RangeServer; do
            status $proc || RETVAL=$?
        done
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|reload|force-reload|condrestart}"
        exit 1
esac

exit $RETVAL
To install the script:
cd /etc/init.d
vi hypertabled
(paste the script above, then save and exit: ESC, then :wq)
chmod +x hypertabled
chkconfig --add hypertabled
setup (check that hypertabled is now listed as a service started at boot)
The Hypertable Query Language (HQL) SELECT Syntax
Friday, October 2nd, 2009
version: Hypertable 0.9.2.7 (v0.9.2.7)
SELECT
EBNF
SELECT ('*' | column_family_name [',' column_family_name]*)
FROM table_name
[where_clause]
[options_spec]
where_clause:
WHERE where_predicate [AND where_predicate ...]
where_predicate:
cell_predicate
| row_predicate
| timestamp_predicate
relop: '=' | '<' | '<=' | '>' | '>=' | '=^'
cell_spec: row ',' column
cell_predicate:
[cell_spec relop] CELL relop cell_spec
| '(' [cell_spec relop] CELL relop cell_spec
(OR [cell_spec relop] CELL relop cell_spec)* ')'
row_predicate:
[row_key relop] ROW relop row_key
| '(' [row_key relop] ROW relop row_key
(OR [row_key relop] ROW relop row_key)* ')'
timestamp_predicate:
[timestamp relop] TIMESTAMP relop timestamp
options_spec:
(REVS revision_count
| LIMIT row_count
| INTO FILE filename[.gz]
| DISPLAY_TIMESTAMPS
| KEYS_ONLY
| NOESCAPE
| RETURN_DELETES)*
timestamp:
'YYYY-MM-DD HH:MM:SS[.nanoseconds]'
Description
The parser only accepts a single timestamp predicate. The ‘=^’ operator is the “starts with” operator. It will return all rows that have the same prefix as the operand.
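One way to think about the "starts with" operator is as a key range: every key with a given prefix sorts between the prefix itself and the prefix with its last character incremented. The sketch below models this in plain Python; it says nothing about how Hypertable implements =^ internally, and `prefix_range` is a hypothetical helper that assumes a non-empty ASCII prefix.

```python
def prefix_range(prefix):
    """Return (start, end) such that start <= key < end matches the prefix.

    Assumes a non-empty prefix whose last character is not the maximum
    code point.
    """
    last = prefix[-1]
    return prefix, prefix[:-1] + chr(ord(last) + 1)


rows = ["a", "b", "ba", "bz", "c"]

# ROW =^ 'b' expressed as a prefix test ...
by_prefix = [r for r in rows if r.startswith("b")]

# ... and as the equivalent half-open range 'b' <= ROW < 'c'.
start, end = prefix_range("b")
by_range = [r for r in rows if start <= r < end]
```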
Options
REVS revision_count
Each cell in a Hypertable table can have multiple timestamped revisions. By default all revisions of a cell are returned by the SELECT statement. The REVS option allows control over the number of cell revisions returned. The cell revisions are stored in reverse-chronological order, so REVS 1 will return the most recent version of the cell.
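The effect of a revision count can be modelled on a plain list of revisions. This is only a sketch of the semantics described above; `select_revs` and the (timestamp, value) tuples are hypothetical and do not use the Hypertable client.

```python
def select_revs(revisions, revs=None):
    """Return the newest `revs` revisions of one cell (all if revs is None).

    revisions is a list of (timestamp, value) tuples in any order.
    """
    # Mirror the storage order: newest revision first.
    newest_first = sorted(revisions, key=lambda r: r[0], reverse=True)
    return newest_first if revs is None else newest_first[:revs]


cell = [(1, "v1"), (3, "v3"), (2, "v2")]
latest = select_revs(cell, 1)
```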
LIMIT row_count
Limits the number of rows returned by the SELECT statement to row_count.
INTO FILE filename[.gz]
The result of a SELECT command is displayed to standard output by default. The INTO FILE option allows the output to be redirected to a file. If the file name specified ends in a .gz extension, then the output is compressed with gzip before it is written to the file. The first line of the output, when using the INTO FILE option, is a header line, which will take one of the two following formats. The second format will be output if the DISPLAY_TIMESTAMPS option is supplied.
#row '\t' column '\t' value
#timestamp '\t' row '\t' column '\t' value
DISPLAY_TIMESTAMPS
The SELECT command displays one cell per line of output. Each line contains three tab delimited fields: row, column, and value. The DISPLAY_TIMESTAMPS option causes the cell timestamp to be included in the output as well. When this option is used, each output line will contain four tab delimited fields in the following order:
timestamp, row, column, value
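A reader for this output format might look like the following sketch. It assumes the default escaping is on (so values contain no raw tabs or newlines) and that no row key begins with '#'; `parse_select_output` is a hypothetical helper, not part of Hypertable.

```python
import io


def parse_select_output(lines):
    """Parse SELECT output lines into dicts, with or without timestamps."""
    cells = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header line
        fields = line.split("\t")
        if len(fields) == 4:
            timestamp, row, column, value = fields
        elif len(fields) == 3:
            timestamp = None
            row, column, value = fields
        else:
            raise ValueError("unexpected field count: %r" % line)
        cells.append({"timestamp": timestamp, "row": row,
                      "column": column, "value": value})
    return cells


sample = "#row\tcolumn\tvalue\nfoo\ttag:a\tv1\n"
cells = parse_select_output(io.StringIO(sample))
```

For a file written with a .gz extension, the same function can be fed the lines of a file opened in text mode with Python's gzip module.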
KEYS_ONLY
The KEYS_ONLY option suppresses the output of the value. It is somewhat efficient because the option is processed by the RangeServers rather than by the client: only the key data is transferred back to the client, not the value data.
NOESCAPE
The output format of a SELECT command comprises tab delimited lines, one cell per line, which is suitable for input to the LOAD DATA INFILE command. However, if the value portion of the cell contains either newline or tab characters, then it will confuse the LOAD DATA INFILE input parser. To prevent this from happening, newline and tab characters are converted into two character escape sequences, described in the following table.
| Character | Escape Sequence |
| newline \n | '\' 'n' |
| tab \t | '\' 't' |
The NOESCAPE option turns off this escaping mechanism.
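The escaping in the table can be sketched in a few lines of Python. The function names are hypothetical, and the sketch covers only the two sequences listed; it is not lossless for a value that already contains a literal backslash followed by 'n' or 't'.

```python
def escape_value(value):
    """Replace newline and tab with the two-character sequences \\n and \\t."""
    return value.replace("\n", "\\n").replace("\t", "\\t")


def unescape_value(value):
    """Reverse escape_value for values produced by it."""
    return value.replace("\\n", "\n").replace("\\t", "\t")


raw = "a\tb\nc"
escaped = escape_value(raw)
```

After escaping, the value occupies a single line with no tab characters, so it can sit safely in one field of a tab delimited line.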
RETURN_DELETES
The RETURN_DELETES option is used internally for debugging. When data is deleted from a table, the data is not actually deleted right away. A delete key is inserted into the database, and the delete is processed and applied during subsequent scans. The RETURN_DELETES option will return the delete keys in addition to the normal cell keys and values. This option can be useful when used in conjunction with the DISPLAY_TIMESTAMPS option to understand how the delete mechanism works.
Examples
SELECT * FROM test WHERE ('a' <= ROW <= 'e') and
'2008-07-28 00:00:02' < TIMESTAMP < '2008-07-28 00:00:07';
SELECT * FROM test WHERE ROW =^ 'b';
SELECT * FROM test WHERE (ROW = 'a' or ROW = 'c' or ROW = 'g');
SELECT * FROM test WHERE ('a' < ROW <= 'c' or ROW = 'g' or ROW = 'c');
SELECT * FROM test WHERE (ROW < 'c' or ROW > 'd');
SELECT * FROM test WHERE (ROW < 'b' or ROW =^ 'b');
SELECT * FROM test WHERE "farm","tag:abaca" < CELL <= "had","tag:abacinate";
SELECT * FROM test WHERE "farm","tag:abaca" <= CELL <= "had","tag:abacinate";
SELECT * FROM test WHERE CELL = "foo","tag:adactylism";
SELECT * FROM test WHERE CELL =^ "foo","tag:ac";
SELECT * FROM test WHERE CELL =^ "foo","tag:a";
SELECT * FROM test WHERE CELL > "old","tag:abacate";
SELECT * FROM test WHERE CELL >= "old","tag:abacate";
SELECT * FROM test WHERE "old","tag:foo" < CELL >= "old","tag:abacate";
SELECT * FROM test WHERE ( CELL = "maui","tag:abaisance" OR
CELL = "foo","tag:adage" OR
CELL = "cow","tag:Ab" OR
CELL =^ "foo","tag:acya");
Access Hypertable via Django and Python on Apache
Friday, October 2nd, 2009
from django.http import HttpResponse
import sys
from hypertable.thriftclient import *
from hyperthrift.gen.ttypes import *

def index(request):
    try:
        # Connect to the ThriftBroker on localhost, port 38080.
        client = ThriftClient("localhost", 38080)
        print("HQL examples")
        res = client.hql_query("show tables")
        print(res)
        res = client.hql_query("select * from thrift_test")
        print(res)
        print("mutator examples")
        # Write one cell through a mutator, then flush it.
        mutator = client.open_mutator("thrift_test", 0, 0)
        client.set_cell(mutator, Cell("py-k1", "col", None, "py-v1"))
        client.flush_mutator(mutator)
        print("scanner examples")
        scanner = client.open_scanner("thrift_test",
                                      ScanSpec(None, None, None, 1), True)
        # Read cells in batches until the scanner is exhausted.
        while True:
            cells = client.next_cells(scanner)
            if len(cells) == 0:
                break
            print(cells)
    except:
        # Log the error and re-raise so Django reports the failure.
        print(sys.exc_info())
        raise
    return HttpResponse("Hello, Django2." + repr(res))