OmniSciDB a5dc49c757

Functions
def verify_destinations(kwargs)
def get_connection(kwargs)
def get_run_vars(kwargs)
def get_gpu_info(kwargs)
def get_machine_info(kwargs)
def read_query_files(kwargs)
def read_setup_teardown_query_files(kwargs)
def validate_setup_teardown_query_file(kwargs)
def validate_query_file(kwargs)
def execute_query(kwargs)
def calculate_query_times(kwargs)
def clear_memory(kwargs)
def clear_system_caches()
def get_mem_usage(kwargs)
def run_query(kwargs)
def run_setup_teardown_query(kwargs)
def json_format_handler(x)
def create_results_dataset(kwargs)
def send_results_db(kwargs)
def send_results_file_json(kwargs)
def send_results_jenkins_bench(kwargs)
def send_results_output(kwargs)
def process_arguments(input_arguments)
def benchmark(input_arguments)
def run_benchmark.benchmark(input_arguments)
Definition at line 1629 of file run_benchmark.py.
References File_Namespace.append(), create_results_dataset(), get_connection(), get_gpu_info(), get_machine_info(), get_run_vars(), process_arguments(), read_query_files(), read_setup_teardown_query_files(), run_query(), run_setup_teardown_query(), send_results_db(), send_results_file_json(), send_results_jenkins_bench(), send_results_output(), split(), to_string(), and verify_destinations().
def run_benchmark.calculate_query_times(kwargs)

Calculates aggregate query times from all iteration times.

Kwargs:
  total_times(list): List of total time calculations
  execution_times(list): List of execution_time calculations
  results_iter_times(list): List of results_iter_time calculations
  connect_times(list): List of connect_time calculations
  trim(float): Amount to trim from the iteration set to gather trimmed values. Enter as a decimal corresponding to the percent to trim, e.g. 0.15 to trim 15%.

Returns:
  query_execution(dict): Query times
  False(bool): The query failed. Exception should be logged.
Definition at line 525 of file run_benchmark.py.
Referenced by create_results_dataset().
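As a hedged illustration of the trimming described above (a minimal sketch, not the script's exact implementation), the trimmed aggregates can be computed by sorting each timing list and slicing `trim` percent of samples off both ends:

    import numpy as np

    def trimmed_stats(times, trim):
        """Drop `trim` (e.g. 0.15 == 15%) of samples from each end, then aggregate.

        Minimal sketch; the real calculate_query_times() also reports
        additional percentiles and per-metric breakdowns.
        """
        n_trim = int(len(times) * trim)  # samples to drop from each end
        ordered = sorted(times)
        trimmed = ordered[n_trim:len(ordered) - n_trim] if n_trim else ordered
        return {
            "avg": float(np.mean(times)),
            "trimmed_avg": float(np.mean(trimmed)),
            "min": min(times),
            "max": max(times),
            "85th_percentile": float(np.percentile(times, 85)),
        }

    # Example: 0.15 trims the fastest and slowest 15% of iterations.
    print(trimmed_stats([101.0, 95.0, 99.0, 250.0, 97.0, 102.0, 98.0], trim=0.15))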
def run_benchmark.clear_memory(kwargs)

Clears CPU or GPU memory.

Kwargs:
  con(class 'pymapd.connection.Connection'): Mapd connection
  mem_type(str): [gpu, cpu] Type of memory to clear

Returns:
  None
Definition at line 603 of file run_benchmark.py.
Referenced by run_query().
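A minimal sketch of how this can be done through pymapd's underlying Thrift client. `_client` and `_session` are private attributes, so treat this as an assumption about pymapd internals rather than stable API:

    def clear_memory(con, mem_type):
        # Assumes pymapd's private Thrift handle; the exact interface may
        # differ between releases. mem_type must be "cpu" or "gpu".
        if mem_type == "cpu":
            con._client.clear_cpu_memory(con._session)
        elif mem_type == "gpu":
            con._client.clear_gpu_memory(con._session)
        else:
            raise ValueError("mem_type must be 'cpu' or 'gpu'")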
def run_benchmark.clear_system_caches()
def run_benchmark.create_results_dataset(kwargs)

Create results dataset.

Kwargs:
  run_guid(str): Run GUID
  run_timestamp(datetime): Run timestamp
  run_connection(str): Connection string
  run_machine_name(str): Run machine name
  run_machine_uname(str): Run machine uname
  run_driver(str): Run driver
  run_version(str): Version of DB
  run_version_short(str): Shortened version of DB
  label(str): Run label
  source_db_gpu_count(int): Number of GPUs on run machine
  source_db_gpu_driver_ver(str): GPU driver version
  source_db_gpu_name(str): GPU name
  source_db_gpu_mem(str): Amount of GPU mem on run machine
  source_table(str): Table to run query against
  trim(float): Trim decimal to remove from top and bottom of results
  iterations(int): Number of iterations of each query to run
  query_group(str): Query group, usually matches table name
  query_results(dict):::
    query_name(str): Name of query
    query_mapdql(str): Query to run
    query_id(str): Query ID
    query_succeeded(bool): Query succeeded
    query_error_info(str): Query error info
    result_count(int): Number of results returned
    initial_iteration_results(dict):::
      first_execution_time(float): Execution time for first query iteration
      first_connect_time(float): Connect time for first query iteration
      first_results_iter_time(float): Results iteration time for first query iteration
      first_total_time(float): Total time for first iteration
      first_cpu_mem_usage(float): CPU memory usage for first query iteration
      first_gpu_mem_usage(float): GPU memory usage for first query iteration
    noninitial_iteration_results(list):::
      execution_time(float): Time (in ms) that pymapd reports the backend spent on the query.
      connect_time(float): Time (in ms) of query overhead, calculated by subtracting backend execution time from time spent in the execution function.
      results_iter_time(float): Time (in ms) it took for pymapd.fetchone() to iterate through all of the results.
      total_time(float): Time (in ms) from adding all above times.
    query_total_elapsed_time(int): Total elapsed time for query

Returns:
  results_dataset(list):::
    result_dataset(dict): Query results dataset
Definition at line 933 of file run_benchmark.py.
References calculate_query_times().
Referenced by benchmark().
def run_benchmark.execute_query(kwargs)

Executes a query against the connected db using pymapd
https://pymapd.readthedocs.io/en/latest/usage.html#querying

Kwargs:
  query_name(str): Name of query
  query_mapdql(str): Query to run
  iteration(int): Iteration number
  con(class): Connection class

Returns:
  query_execution(dict):::
    result_count(int): Number of results returned
    execution_time(float): Time (in ms) that pymapd reports the backend spent on the query.
    connect_time(float): Time (in ms) of query overhead, calculated by subtracting backend execution time from time spent in the execution function.
    results_iter_time(float): Time (in ms) it took for pymapd.fetchone() to iterate through all of the results.
    total_time(float): Time (in ms) from adding all above times.
  False(bool): The query failed. Exception should be logged.
Definition at line 443 of file run_benchmark.py.
Referenced by run_query().
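A hedged sketch of the timing breakdown described above. Wall-clock measurements use time.perf_counter(); reading the backend-reported execution time from `_result.execution_time_ms` on pymapd's cursor is an assumption about pymapd internals:

    import time

    def execute_query(con, query_name, query_mapdql, iteration):
        # Minimal sketch of one timed iteration; error handling trimmed.
        try:
            start = time.perf_counter()
            result = con.execute(query_mapdql)           # run the query
            execute_elapsed_ms = (time.perf_counter() - start) * 1000

            start = time.perf_counter()
            rows = 0
            while result.fetchone() is not None:         # walk the full result set
                rows += 1
            results_iter_time = (time.perf_counter() - start) * 1000
        except Exception as e:
            print(f"{query_name} iteration {iteration} failed: {e}")
            return False                                 # caller logs the exception

        # Backend time as reported by the server; private attribute, may change.
        execution_time = result._result.execution_time_ms
        connect_time = execute_elapsed_ms - execution_time  # overhead around the backend call
        return {
            "result_count": rows,
            "execution_time": execution_time,
            "connect_time": connect_time,
            "results_iter_time": results_iter_time,
            "total_time": execution_time + connect_time + results_iter_time,
        }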
def run_benchmark.get_connection(kwargs)

Connects to the db using pymapd
https://pymapd.readthedocs.io/en/latest/usage.html#connecting

Kwargs:
  db_user(str): DB username
  db_passwd(str): DB password
  db_server(str): DB host
  db_port(int): DB port
  db_name(str): DB name

Returns:
  con(class): Connection class
  False(bool): The connection failed. Exception should be logged.
Definition at line 63 of file run_benchmark.py.
Referenced by run_benchmark_arrow.benchmark(), benchmark(), and send_results_db().
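For reference, a minimal sketch of the pymapd connection call this wraps (host and credentials below are placeholders):

    from pymapd import connect

    def get_connection(db_user, db_passwd, db_server, db_port, db_name):
        # Returns the connection, or False so the caller can log the failure.
        try:
            return connect(user=db_user, password=db_passwd,
                           host=db_server, port=db_port, dbname=db_name)
        except Exception as e:
            print(f"Could not connect to {db_server}:{db_port}: {e}")
            return False

    con = get_connection("admin", "HyperInteractive", "localhost", 6274, "omnisci")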
def run_benchmark.get_gpu_info(kwargs)

Gets run machine GPU info.

Kwargs:
  gpu_name(str): GPU name from input param
  no_gather_conn_gpu_info(bool): Do not gather GPU info fields from the connection
  con(class 'pymapd.connection.Connection'): Mapd connection
  conn_machine_name(str): Name of run machine
  no_gather_nvml_gpu_info(bool): Do not gather GPU info using nvml
  gather_nvml_gpu_info(bool): Gather GPU info using nvml
  gpu_count(int): Number of GPUs on run machine

Returns:
  gpu_info(dict):::
    conn_gpu_count(int): Number of GPUs gathered from pymapd con
    source_db_gpu_count(int): Number of GPUs on run machine
    source_db_gpu_mem(str): Amount of GPU mem on run machine
    source_db_gpu_driver_ver(str): GPU driver version
    source_db_gpu_name(str): GPU name
Definition at line 136 of file run_benchmark.py.
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
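The nvml path can be sketched with the pynvml package. This is a minimal illustration; the script's exact field handling may differ, and older pynvml versions return bytes from the name/version calls:

    import pynvml

    def gather_nvml_gpu_info():
        # Query driver version, device name, and memory via NVML.
        pynvml.nvmlInit()
        try:
            driver_ver = pynvml.nvmlSystemGetDriverVersion()
            gpu_count = pynvml.nvmlDeviceGetCount()
            handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU as representative
            name = pynvml.nvmlDeviceGetName(handle)
            mem_total = pynvml.nvmlDeviceGetMemoryInfo(handle).total
        finally:
            pynvml.nvmlShutdown()
        decode = lambda v: v.decode() if isinstance(v, bytes) else v
        return {
            "source_db_gpu_count": gpu_count,
            "source_db_gpu_driver_ver": decode(driver_ver),
            "source_db_gpu_name": decode(name),
            "source_db_gpu_mem": str(mem_total),
        }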
def run_benchmark.get_machine_info(kwargs)

Gets run machine info.

Kwargs:
  conn_machine_name(str): Name of machine from pymapd con
  machine_name(str): Name of machine if passed in
  machine_uname(str): Uname of machine if passed in

Returns:
  machine_info(dict):::
    run_machine_name(str): Run machine name
    run_machine_uname(str): Run machine uname
Definition at line 237 of file run_benchmark.py.
References join().
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
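A minimal sketch of the fallback logic implied by the kwargs (the precedence between the three name sources is an assumption):

    import platform

    def get_machine_info(conn_machine_name=None, machine_name=None, machine_uname=None):
        # Prefer explicitly passed values, fall back to the local machine.
        uname = machine_uname or " ".join(platform.uname())
        name = machine_name or conn_machine_name or platform.node()
        return {"run_machine_name": name, "run_machine_uname": uname}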
def run_benchmark.get_mem_usage(kwargs)

Calculates memory statistics from the mapd server's _client.get_memory call.

Kwargs:
  con(class 'pymapd.connection.Connection'): Mapd connection
  mem_type(str): [gpu, cpu] Type of memory to gather metrics for

Returns:
  ramusage(dict):::
    usedram(float): Amount of memory (in MB) used
    freeram(float): Amount of memory (in MB) free
    totalallocated(float): Total amount of memory (in MB) allocated
    errormessage(str): Error if returned by get_memory call
    rawdata(list): Raw data returned from get_memory call
Definition at line 639 of file run_benchmark.py.
Referenced by run_benchmark_arrow.run_query(), and run_query().
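A hedged sketch of the raw call and the MB conversion; `_client`, `_session`, and the node-memory field names are assumptions about pymapd/Thrift internals:

    def get_mem_usage(con, mem_type):
        # mem_type is "cpu" or "gpu"; values are converted from pages to MB.
        raw = con._client.get_memory(session=con._session, memory_level=mem_type)
        page_size = raw[0].page_size
        allocated_pages = sum(node.num_pages_allocated for node in raw)
        used_pages = sum(seg.num_pages
                         for node in raw for seg in node.node_memory_data
                         if not seg.is_free)
        mb = lambda pages: pages * page_size / (1024 ** 2)
        return {
            "usedram": mb(used_pages),
            "freeram": mb(allocated_pages - used_pages),
            "totalallocated": mb(allocated_pages),
            "errormessage": "",
            "rawdata": raw,
        }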
def run_benchmark.get_run_vars(kwargs)

Gets/sets run-specific vars such as time, uid, etc.

Kwargs:
  con(class 'pymapd.connection.Connection'): Mapd connection

Returns:
  run_vars(dict):::
    run_guid(str): Run GUID
    run_timestamp(datetime): Run timestamp
    run_connection(str): Connection string
    run_driver(str): Run driver
    run_version(str): Version of DB
    run_version_short(str): Shortened version of DB
    conn_machine_name(str): Name of run machine
Definition at line 95 of file run_benchmark.py.
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
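A minimal sketch of generating the run identifiers. Reading the server version through the private Thrift handle, and the naive host parse, are assumptions for illustration:

    import datetime
    import uuid

    def get_run_vars(con):
        run_guid = str(uuid.uuid4())                 # unique id for this run
        run_timestamp = datetime.datetime.now()
        run_connection = str(con)                    # pymapd's repr of the connection
        run_version = con._client.get_version()      # assumed Thrift call; private handle
        run_version_short = run_version.split("-")[0]
        return {
            "run_guid": run_guid,
            "run_timestamp": run_timestamp,
            "run_connection": run_connection,
            "run_driver": "",                        # populated elsewhere in the script
            "run_version": run_version,
            "run_version_short": run_version_short,
            # Naive host parse from the connection string; an assumption.
            "conn_machine_name": run_connection.split("@")[-1].split(":")[0],
        }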
def run_benchmark.json_format_handler(x)
Definition at line 924 of file run_benchmark.py.
def run_benchmark.process_arguments(input_arguments)
Definition at line 1327 of file run_benchmark.py.
Referenced by benchmark().
def run_benchmark.read_query_files(kwargs)

Reads query files from the queries directory.

Kwargs:
  queries_dir(str): Directory with query files
  source_table(str): Table to run query against

Returns:
  query_list(dict):::
    query_group(str): Query group, usually matches table name
    queries(list):::
      query(dict):::
        name(str): Name of query
        mapdql(str): Query syntax to run
  False(bool): Unable to find queries dir
Definition at line 277 of file run_benchmark.py.
References File_Namespace.append(), heavyai.open(), split(), and validate_query_file().
Referenced by run_benchmark_arrow.benchmark(), benchmark(), and read_setup_teardown_query_files().
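A minimal sketch of the directory walk, reusing this module's validate_query_file(). The `###TAB###` table placeholder is an assumption for illustration:

    import os

    def read_query_files(queries_dir, source_table):
        # Collect query files from queries_dir into the query_list structure.
        if not os.path.isdir(queries_dir):
            print(f"Queries dir {queries_dir} does not exist")
            return False
        query_list = {"query_group": os.path.basename(queries_dir), "queries": []}
        for query_filename in sorted(os.listdir(queries_dir)):
            if not validate_query_file(query_filename=query_filename):
                continue
            with open(os.path.join(queries_dir, query_filename)) as f:
                # Substitute the target table into the query text; the
                # "###TAB###" placeholder convention is an assumption.
                mapdql = f.read().replace("###TAB###", source_table)
            query_list["queries"].append({"name": query_filename, "mapdql": mapdql})
        return query_list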
def run_benchmark.read_setup_teardown_query_files(kwargs)

Get queries to run for setup and teardown from directory.

Kwargs:
  queries_dir(str): Directory with query files
  source_table(str): Table to run query against
  foreign_table_filename(str): File to create foreign table from

Returns:
  setup_queries(query_list): List of setup queries
  teardown_queries(query_list): List of teardown queries
  False(bool): Unable to find queries dir

query_list is described by:
  query_list(dict):::
    query_group(str): Query group, usually matches table name
    queries(list):::
      query(dict):::
        name(str): Name of query
        mapdql(str): Query syntax to run
Definition at line 323 of file run_benchmark.py.
References read_query_files(), and validate_setup_teardown_query_file().
Referenced by benchmark().
def run_benchmark.run_query(kwargs)

Takes query name, syntax, and iteration count and calls the execute_query
function for each iteration. Reports total, iteration, and exec timings,
memory usage, and failure status.

Kwargs:
  query(dict):::
    name(str): Name of query
    mapdql(str): Query syntax to run
  iterations(int): Number of iterations of each query to run
  trim(float): Trim decimal to remove from top and bottom of results
  con(class 'pymapd.connection.Connection'): Mapd connection
  clear_all_memory_pre_query(bool, optional): Flag to determine if memory is cleared between query runs

Returns:
  query_results(dict):::
    query_name(str): Name of query
    query_mapdql(str): Query to run
    query_id(str): Query ID
    query_succeeded(bool): Query succeeded
    query_error_info(str): Query error info
    result_count(int): Number of results returned
    initial_iteration_results(dict):::
      first_execution_time(float): Execution time for first query iteration
      first_connect_time(float): Connect time for first query iteration
      first_results_iter_time(float): Results iteration time for first query iteration
      first_total_time(float): Total time for first iteration
      first_cpu_mem_usage(float): CPU memory usage for first query iteration
      first_gpu_mem_usage(float): GPU memory usage for first query iteration
    noninitial_iteration_results(list):::
      execution_time(float): Time (in ms) that pymapd reports the backend spent on the query.
      connect_time(float): Time (in ms) of query overhead, calculated by subtracting backend execution time from time spent in the execution function.
      results_iter_time(float): Time (in ms) it took for pymapd.fetchone() to iterate through all of the results.
      total_time(float): Time (in ms) from adding all above times.
    query_total_elapsed_time(int): Total elapsed time for query
  False(bool): The query failed. Exception should be logged.
Definition at line 688 of file run_benchmark.py.
References File_Namespace.append(), clear_memory(), clear_system_caches(), execute_query(), and get_mem_usage().
Referenced by benchmark(), RelAlgExecutor.executeRelAlgQuery(), and run_setup_teardown_query().
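The iteration loop can be outlined as follows — a minimal, hedged sketch in which the first iteration seeds the `first_*` fields and is excluded from the aggregate statistics; memory gathering and cache clearing are trimmed for brevity:

    def run_query_iterations(con, query, iterations, trim):
        results = {"query_name": query["name"], "query_mapdql": query["mapdql"],
                   "query_succeeded": True, "noninitial_iteration_results": []}
        times = {"total_times": [], "execution_times": [],
                 "results_iter_times": [], "connect_times": []}
        for iteration in range(iterations):
            execution = execute_query(con=con, query_name=query["name"],
                                      query_mapdql=query["mapdql"],
                                      iteration=iteration)
            if execution is False:
                return False  # caller logs the failure
            if iteration == 0:
                # The first run warms caches; record it separately and keep
                # it out of the aggregate statistics.
                results["initial_iteration_results"] = {
                    "first_total_time": execution["total_time"]}
            else:
                results["noninitial_iteration_results"].append(execution)
                times["total_times"].append(execution["total_time"])
                times["execution_times"].append(execution["execution_time"])
                times["results_iter_times"].append(execution["results_iter_time"])
                times["connect_times"].append(execution["connect_time"])
        results["query_times"] = calculate_query_times(trim=trim, **times)
        return results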
def run_benchmark.run_setup_teardown_query(kwargs)

Convenience wrapper around `run_query` to run a setup or teardown query.

Kwargs:
  queries(query_list): List of queries to run
  do_run(bool): If true will run query, otherwise do nothing
  trim(float): Trim decimal to remove from top and bottom of results
  con(class 'pymapd.connection.Connection'): Mapd connection

Returns:
  See return value for `run_query`

query_list is described by:
  queries(list):::
    query(dict):::
      name(str): Name of query
      mapdql(str): Query syntax to run
      [setup : queries(list)]
      [teardown : queries(list)]
Definition at line 883 of file run_benchmark.py.
References run_query().
Referenced by benchmark().
def run_benchmark.send_results_db(kwargs)

Send results dataset to a database using pymapd.

Kwargs:
  results_dataset(list):::
    result_dataset(dict): Query results dataset
  table(str): Results destination table name
  db_user(str): Results destination user name
  db_passwd(str): Results destination password
  db_server(str): Results destination server address
  db_port(int): Results destination server port
  db_name(str): Results destination database name
  table_schema_file(str): Path to destination database schema file

Returns:
  True(bool): Sending results to destination database succeeded
  False(bool): Sending results to destination database failed. Exception should be logged.
Definition at line 1145 of file run_benchmark.py.
References get_connection(), and heavyai.open().
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
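A hedged sketch of the destination write using this module's get_connection() and pymapd's load_table; the schema-file bootstrap and the assumption that dict order matches the table schema are illustrative:

    def send_results_db(results_dataset, table, db_user, db_passwd,
                        db_server, db_port, db_name, table_schema_file=None):
        con = get_connection(db_user=db_user, db_passwd=db_passwd,
                             db_server=db_server, db_port=db_port, db_name=db_name)
        if not con:
            return False
        try:
            if table_schema_file:
                # Create the destination table if a schema file was supplied.
                with open(table_schema_file) as f:
                    con.execute(f.read())
            # Assumes each result dict's insertion order matches the schema.
            rows = [tuple(r.values()) for r in results_dataset]
            con.load_table(table, rows)
            return True
        except Exception as e:
            print(f"Failed to send results to {db_name}.{table}: {e}")
            return False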
def run_benchmark.send_results_file_json(kwargs)

Send results dataset to a local json file.

Kwargs:
  results_dataset_json(str): Json-formatted query results dataset
  output_file_json(str): Location of .json file output

Returns:
  True(bool): Sending results to json file succeeded
  False(bool): Sending results to json file failed. Exception should be logged.
Definition at line 1220 of file run_benchmark.py.
References heavyai.open().
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
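Since the dataset arrives already JSON-encoded, the write reduces to a plain file dump; a minimal sketch:

    def send_results_file_json(results_dataset_json, output_file_json):
        try:
            with open(output_file_json, "w") as f:
                f.write(results_dataset_json)  # dataset is already a JSON string
            return True
        except OSError as e:
            print(f"Could not write {output_file_json}: {e}")
            return False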
def run_benchmark.send_results_jenkins_bench(kwargs)

Send results dataset to a local json file formatted for use with the
jenkins benchmark plugin: https://github.com/jenkinsci/benchmark-plugin

Kwargs:
  results_dataset(list):::
    result_dataset(dict): Query results dataset
  thresholds_name(str): Name to use for Jenkins result field
  thresholds_field(str): Field to use for query threshold in jenkins
  output_tag_jenkins(str): Jenkins benchmark result tag, for different sets from same table
  output_file_jenkins(str): Location of .json jenkins file output

Returns:
  True(bool): Sending results to json file succeeded
  False(bool): Sending results to json file failed. Exception should be logged.
Definition at line 1246 of file run_benchmark.py.
References heavyai.open().
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
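A hedged sketch of shaping one query's result for the plugin. The exact schema (groups/tests/results with dblValue) is an assumption drawn from the plugin's documented examples, and to_jenkins_bench_json is a hypothetical helper name:

    import json

    def to_jenkins_bench_json(results_dataset, thresholds_name,
                              thresholds_field, output_tag_jenkins):
        # Hypothetical helper; field names follow the plugin's sample schema.
        tests = []
        for result in results_dataset:
            tests.append({
                "name": result["query_id"] + output_tag_jenkins,
                "description": "",
                "parameters": [],
                "results": [{
                    "name": thresholds_name,
                    "description": "",
                    "unit": "ms",
                    "dblValue": result[thresholds_field],
                }],
            })
        return json.dumps({"groups": [{"name": "benchmarks",
                                       "description": "",
                                       "tests": tests}]})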
def run_benchmark.send_results_output(kwargs)

Send results dataset to script output.

Kwargs:
  results_dataset_json(str): Json-formatted query results dataset

Returns:
  True(bool): Sending results to output succeeded
Definition at line 1312 of file run_benchmark.py.
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
def run_benchmark.validate_query_file(kwargs)

Validates a query file. Currently only checks the query file name.

Kwargs:
  query_filename(str): Name of query file

Returns:
  True(bool): Query successfully validated
  False(bool): Query failed validation
Definition at line 421 of file run_benchmark.py.
Referenced by read_query_files().
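A minimal sketch of the filename check (the `.sql` extension rule is an assumption for illustration):

    import logging

    def validate_query_file(query_filename):
        # Only files that look like SQL scripts are accepted; .sql is assumed.
        if not query_filename.endswith(".sql"):
            logging.warning("Query filename %s is invalid, skipping", query_filename)
            return False
        return True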
def run_benchmark.validate_setup_teardown_query_file(kwargs)

Validates a setup/teardown query file. Currently only checks the query
file name, and checks for 'setup' or 'teardown' in the basename.

Kwargs:
  query_filename(str): Name of query file
  check_which(str): Either 'setup' or 'teardown'; decides which to check
  quiet(bool): Optional; if True, no warning is logged

Returns:
  True(bool): Query successfully validated
  False(bool): Query failed validation
Definition at line 377 of file run_benchmark.py.
Referenced by read_setup_teardown_query_files().
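A hedged sketch combining the plain validator with the basename check described above; the substring rule is an assumption:

    import logging
    import os

    def validate_setup_teardown_query_file(query_filename, check_which, quiet=False):
        # Reuse the generic filename validation, then require the marker word.
        if not validate_query_file(query_filename=query_filename):
            return False
        if check_which not in os.path.basename(query_filename):
            if not quiet:
                logging.warning("Query file %s does not contain '%s', skipping",
                                query_filename, check_which)
            return False
        return True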
def run_benchmark.verify_destinations(kwargs)

Verify script output destination(s).

Kwargs:
  destinations(list): List of destinations
  dest_db_server(str): DB output destination server
  output_file_json(str): Location of .json file output
  output_file_jenkins(str): Location of .json jenkins file output

Returns:
  True(bool): Destination(s) is/are valid
  False(bool): Destination(s) is/are not valid
Definition at line 19 of file run_benchmark.py.
Referenced by run_benchmark_arrow.benchmark(), and benchmark().
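A minimal sketch of the destination check implied by the kwargs; the destination keywords ("mapd_db", "file_json", "jenkins_bench", "output") are assumptions for illustration:

    def verify_destinations(destinations, dest_db_server=None,
                            output_file_json=None, output_file_jenkins=None):
        # Each requested destination must come with the option it needs.
        required = {
            "mapd_db": dest_db_server,
            "file_json": output_file_json,
            "jenkins_bench": output_file_jenkins,
            "output": True,   # stdout needs no extra option
        }
        for dest in destinations:
            if dest not in required:
                print(f"Invalid destination: {dest}")
                return False
            if not required[dest]:
                print(f"Destination {dest} is missing its required option")
                return False
        return True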