Parallel loading of all rows in Cassandra using multiple (Python) clients

When using Cassandra's recommended RandomPartitioner (or Murmur3Partitioner), it is not possible to do meaningful range queries on keys, because the rows are distributed around the cluster using the MD5 hash of the key (Murmur3Partitioner uses a MurmurHash3 hash instead). These hashes are called "tokens".
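To make the idea of a token concrete, here is a minimal Python 3 sketch of how a RandomPartitioner-style token can be derived from a key. It assumes the token is the absolute value of the MD5 digest read as a signed 128-bit big-endian integer, and that an `int` key is serialized as 4 big-endian bytes. The function names `random_partitioner_token` and `int_key_bytes` are made up for illustration and are not part of any Cassandra client library.

```python
import hashlib
import struct


def random_partitioner_token(key_bytes):
    """Return a RandomPartitioner-style token for the serialized key.

    Assumption: the token is abs() of the MD5 digest interpreted as a
    signed, big-endian 128-bit integer, so it falls in [0, 2**127].
    """
    digest = hashlib.md5(key_bytes).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def int_key_bytes(k):
    """Serialize an int key as 4 big-endian bytes (assumed wire format)."""
    return struct.pack(">i", k)


if __name__ == "__main__":
    # Neighbouring keys land on unrelated tokens, which is why range
    # queries on the keys themselves are not meaningful.
    for k in range(4):
        print(k, random_partitioner_token(int_key_bytes(k)))
```

Murmur3Partitioner tokens are signed 64-bit values instead, so the same reasoning applies there with the range -2**63 to 2**63 - 1.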

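Nonetheless, it would be very useful to split up a large table among many compute workers by assigning each a range of tokens. Using CQL3, it appears possible to issue queries directly against the tokens; however, the following Python does not work... EDIT: it works after switching to testing against the latest version of the Cassandra database (doh!) and updating the syntax per the notes below: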

    ## use the python cql module
    import cql

    ## If running against an old version of Cassandra, this raises:
    ## TApplicationException: Invalid method name: 'set_cql_version'
    conn = cql.connect('localhost', cql_version='3.0.2')
    cursor = conn.cursor()

    try:
        ## remove the previous attempt to make this work
        cursor.execute('DROP KEYSPACE test;')
    except Exception as exc:
        print(exc)

    ## make a keyspace and a simple table
    cursor.execute("CREATE KEYSPACE test WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor = 1;")
    cursor.execute("USE test;")
    cursor.execute('CREATE TABLE data (k int PRIMARY KEY, v varchar);')

    ## put some data in the table -- must use single quotes around literals, not double quotes
    cursor.execute("INSERT INTO data (k,v) VALUES (0,'a');")
    cursor.execute("INSERT INTO data (k,v) VALUES (1,'b');")
    cursor.execute("INSERT INTO data (k,v) VALUES (2,'c');")
    cursor.execute("INSERT INTO data (k,v) VALUES (3,'d');")

    ## split up the full range of tokens.
    ## Suppose there are 2**k workers:
    k = 3  # --> eight workers
    token_sub_range = 2**(127 - k)
    worker_num = 2  # for example
    start_token = worker_num * token_sub_range
    end_token = (1 + worker_num) * token_sub_range

    ## put single quotes around the token strings
    cql3_command = "SELECT k,v FROM data WHERE token(k) >= '%d' AND token(k) < '%d';" % (start_token, end_token)
    print(cql3_command)

    ## against an old version of Cassandra, this failed with
    ## "ProgrammingError: Bad Request: line 1:28 no viable alternative at input 'token'"
    cursor.execute(cql3_command)
    for row in cursor:
        print(row)

    cursor.close()
    conn.close()
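The token arithmetic above only handles a power-of-two number of workers. As a rough sketch, the same split can be generalized to any worker count. The helper names `token_ranges` and `build_query` are hypothetical, the default bounds assume RandomPartitioner's 0 to 2**127 token range, and the single quotes around the token values simply follow the snippet above (other Cassandra versions may expect unquoted integers).

```python
def token_ranges(num_workers, min_token=0, max_token=2 ** 127):
    """Split [min_token, max_token] into num_workers contiguous (start, end) pairs."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    span = max_token - min_token
    # Integer arithmetic keeps the boundaries exact even for 128-bit values.
    bounds = [min_token + span * i // num_workers for i in range(num_workers + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def build_query(start, end, last=False):
    """Build the token-range SELECT for one worker.

    The final worker uses <= so that the very top token is not skipped.
    """
    upper_op = "<=" if last else "<"
    return ("SELECT k, v FROM data WHERE token(k) >= '%d' AND token(k) %s '%d';"
            % (start, upper_op, end))


if __name__ == "__main__":
    ranges = token_ranges(8)
    for i, (start, end) in enumerate(ranges):
        print(build_query(start, end, last=(i == len(ranges) - 1)))
```

For Murmur3Partitioner, something like `token_ranges(n, -2**63, 2**63 - 1)` should give the equivalent split, since its tokens are signed 64-bit integers.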

Ideally, I would also like to make this work with pycassa, because I prefer its more Pythonic interface.

Is there a better way to do this?

Solution: I have updated the question to include the answer.
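To check that per-worker token ranges really cover every row exactly once, here is a small Python 3 simulation that needs no cluster. Everything in it is a stand-in: `TABLE` is an in-memory dictionary rather than a Cassandra table, `token_of` assumes RandomPartitioner-style MD5 tokens for 4-byte integer keys, and threads play the role of separate client processes. In a real setup each worker would open its own connection and run the token-range SELECT instead of filtering a dictionary.

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor

MAX_TOKEN = 2 ** 127  # assumed upper bound of RandomPartitioner tokens

# In-memory stand-in for the "data" table (hypothetical contents).
TABLE = {k: "v%d" % k for k in range(1000)}


def token_of(key):
    """RandomPartitioner-style token of an int key (assumed 4-byte big-endian)."""
    digest = hashlib.md5(struct.pack(">i", key)).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def load_range(bounds):
    """Simulate one worker: return the rows whose token is in [start, end)."""
    start, end = bounds
    return [(k, v) for k, v in TABLE.items() if start <= token_of(k) < end]


def load_all(num_workers):
    """Run one simulated worker per token sub-range and merge the results."""
    step = MAX_TOKEN // num_workers
    ranges = [(i * step, (i + 1) * step) for i in range(num_workers)]
    # Widen the last range so the top of the token space is included.
    ranges[-1] = (ranges[-1][0], MAX_TOKEN + 1)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        chunks = list(pool.map(load_range, ranges))
    return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    rows = load_all(8)
    print(len(rows), "rows loaded; table has", len(TABLE))
```

Because the ranges are contiguous and non-overlapping, the merged result should contain each row once, regardless of how many workers are used.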


Original article: https://www.54852.com/langs/1196790.html
