{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Elliptic Cryptocurrency Transactions\n", "\n", "**Author: Adi Kondepudi**" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "https://www.kaggle.com/datasets/ellipticco/elliptic-data-set?resource=download\n", "\n", "The Elliptic data set maps Bitcoin transactions to the real-world entities behind them, labelled licit or illicit.\n", "\n", "There are 203,769 nodes and 234,355 edges. Nodes are transactions and edges are Bitcoin payment flows between them.\n", "\n", "Among the nodes, 2% (4,545) are illicit and 21% (42,019) are licit. The rest are unlabelled (`unknown`).\n", "\n", "There are 166 features for each node.\n", "\n", "The first feature is the time step for that node. Each time step is a single connected component of transactions that all appeared on the blockchain within three hours of each other.\n", "\n", "The next 93 features describe the transaction itself (fees, volume, averages, etc.).\n", "\n", "The last 72 features are aggregated from adjacent nodes." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Import general-purpose libraries.\n", "# Specific functions will be imported separately in later cells.\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import networkx as nx\n", "import graspologic" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

| | id | class | time step | local_feat_0 | local_feat_1 | local_feat_2 | local_feat_3 | local_feat_4 | local_feat_5 | local_feat_6 | ... | agg_feat_62 | agg_feat_63 | agg_feat_64 | agg_feat_65 | agg_feat_66 | agg_feat_67 | agg_feat_68 | agg_feat_69 | agg_feat_70 | agg_feat_71 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 230425980 | unknown | 1 | -0.171469 | -0.184668 | -1.201369 | -0.121970 | -0.043875 | -0.113002 | -0.061584 | ... | -0.562153 | -0.600999 | 1.461330 | 1.461369 | 0.018279 | -0.087490 | -0.131155 | -0.097524 | -0.120613 | -0.119792 |
| 1 | 5530458 | unknown | 1 | -0.171484 | -0.184668 | -1.201369 | -0.121970 | -0.043875 | -0.113002 | -0.061584 | ... | 0.947382 | 0.673103 | -0.979074 | -0.978556 | 0.018279 | -0.087490 | -0.131155 | -0.097524 | -0.120613 | -0.119792 |
| 2 | 232022460 | unknown | 1 | -0.172107 | -0.184668 | -1.201369 | -0.121970 | -0.043875 | -0.113002 | -0.061584 | ... | 0.670883 | 0.439728 | -0.979074 | -0.978556 | -0.098889 | -0.106715 | -0.131155 | -0.183671 | -0.120613 | -0.119792 |
| 3 | 232438397 | licit | 1 | 0.163054 | 1.963790 | -0.646376 | 12.409294 | -0.063725 | 9.782742 | 12.414558 | ... | -0.577099 | -0.613614 | 0.241128 | 0.241406 | 1.072793 | 0.085530 | -0.131155 | 0.677799 | -0.120613 | -0.119792 |
| 4 | 230460314 | unknown | 1 | 1.011523 | -0.081127 | -1.201369 | 1.153668 | 0.333276 | 1.312656 | -0.061584 | ... | -0.511871 | -0.400422 | 0.517257 | 0.579382 | 0.018279 | 0.277775 | 0.326394 | 1.293750 | 0.178136 | 0.179117 |

5 rows × 168 columns
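
The 168 columns in the preview above are consistent with the feature breakdown described earlier: an `id`, a `class`, the time step, 93 local features, and 72 aggregated features. The loading cell itself is not shown, so the following is only a minimal sketch of how such a frame might be assembled. The function names, the file paths passed in, and the class-code mapping (`"1"` for illicit, `"2"` for licit) are assumptions about the Kaggle CSV files, not taken from this notebook.

```python
import pandas as pd

N_LOCAL = 93  # local transaction features, per the description above
N_AGG = 72    # features aggregated from adjacent nodes


def feature_columns():
    """Column names for a header-less features file: id, time step, then features."""
    local = [f"local_feat_{i}" for i in range(N_LOCAL)]
    agg = [f"agg_feat_{i}" for i in range(N_AGG)]
    return ["id", "time step"] + local + agg


def load_elliptic(features_csv, classes_csv):
    """Hypothetical loader; paths and the class-code mapping are assumptions."""
    features = pd.read_csv(features_csv, header=None, names=feature_columns())
    classes = pd.read_csv(
        classes_csv, header=0, names=["id", "class"], dtype={"class": str}
    )
    # Assumed encoding of the raw class column: "1", "2", or "unknown".
    classes["class"] = classes["class"].map(
        {"1": "illicit", "2": "licit", "unknown": "unknown"}
    )
    # Merging classes first puts `class` right after `id`, as in the preview.
    return classes.merge(features, on="id")


cols = feature_columns()
print(len(cols))  # id + time step + 93 + 72; the class column is added by the merge
```

Naming the columns up front keeps the 1 + 93 + 72 split explicit, which makes it easier to select only local or only aggregated features later.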

| | time step | class | count |
|---|---|---|---|
| 93 | 32 | illicit | 342 |
| 84 | 29 | illicit | 329 |
| 36 | 13 | illicit | 291 |
| 57 | 20 | illicit | 260 |
| 24 | 9 | illicit | 248 |
| 123 | 42 | illicit | 239 |
| 102 | 35 | illicit | 182 |
| 63 | 22 | illicit | 158 |
| 42 | 15 | illicit | 147 |
| 69 | 24 | illicit | 137 |
| 30 | 11 | illicit | 131 |
| 45 | 16 | illicit | 128 |
| 72 | 25 | illicit | 118 |
| 120 | 41 | illicit | 116 |
| 117 | 40 | illicit | 112 |
| 111 | 38 | illicit | 111 |
| 90 | 31 | illicit | 106 |
| 18 | 7 | illicit | 102 |
| 60 | 21 | illicit | 100 |
| 48 | 17 | illicit | 99 |
| 75 | 26 | illicit | 96 |
| 81 | 28 | illicit | 85 |
| 87 | 30 | illicit | 83 |
| 114 | 39 | illicit | 81 |
| 54 | 19 | illicit | 80 |
| 21 | 8 | illicit | 67 |
| 144 | 49 | illicit | 56 |
| 66 | 23 | illicit | 53 |
| 51 | 18 | illicit | 52 |
| 39 | 14 | illicit | 43 |
| 108 | 37 | illicit | 40 |
| 99 | 34 | illicit | 37 |
| 141 | 48 | illicit | 36 |
| 105 | 36 | illicit | 33 |
| 9 | 4 | illicit | 30 |
| 78 | 27 | illicit | 24 |
| 126 | 43 | illicit | 24 |
| 129 | 44 | illicit | 24 |
| 96 | 33 | illicit | 23 |
| 138 | 47 | illicit | 22 |
| 27 | 10 | illicit | 18 |
| 3 | 2 | illicit | 18 |
| 0 | 1 | illicit | 17 |
| 33 | 12 | illicit | 16 |
| 6 | 3 | illicit | 11 |
| 12 | 5 | illicit | 8 |
| 132 | 45 | illicit | 5 |
| 15 | 6 | illicit | 5 |
| 135 | 46 | illicit | 2 |
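
The cell that produced the illicit counts per time step is not shown. The row labels in the table above (0 for time step 1, 3 for time step 2, and so on) are consistent with a group-by over `time step` and `class` with three classes per step, followed by a filter and a descending sort. A minimal sketch under that assumption, using a toy frame (`toy` and its values are hypothetical) in place of the real node table:

```python
import pandas as pd

# Toy stand-in for the labelled node table; only the two columns used here.
toy = pd.DataFrame({
    "time step": [1, 1, 1, 2, 2, 2, 2],
    "class": ["illicit", "licit", "unknown",
              "illicit", "illicit", "licit", "unknown"],
})

# Count nodes per (time step, class) pair; reset_index gives a flat 0..n-1 index.
counts = (
    toy.groupby(["time step", "class"])
    .size()
    .reset_index(name="count")
)

# Keep only illicit rows and order them from most to fewest illicit nodes.
illicit_counts = (
    counts[counts["class"] == "illicit"]
    .sort_values("count", ascending=False)
)
print(illicit_counts)
```

Because the filter is applied after `reset_index`, the surviving rows keep their original labels, which is why the index in such a table advances in steps of three.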

| node_id | community | strength | rank_strength | x | y |
|---|---|---|---|---|---|
| 294468623 | 1 | 1 | 1.0 | -7.083713 | 6.587129 |
| 294326756 | 2 | 1 | 1.0 | 11.408134 | 14.842425 |
| 294370619 | 22 | 2 | 2.0 | 7.499983 | 10.402857 |
| 294372200 | 1 | 1 | 1.0 | -7.002121 | 6.867332 |
| 294324191 | 1 | 1 | 1.0 | -7.139164 | 7.864409 |
| ... | ... | ... | ... | ... | ... |
| 1891081 | 1 | 95 | 25.0 | 8.650565 | 13.803895 |
| 294374011 | 1 | 1 | 1.0 | -6.657413 | 6.512280 |
| 294300722 | 22 | 1 | 1.0 | 6.279359 | 2.508360 |
| 294375201 | 1 | 1 | 1.0 | -6.410471 | 7.060765 |
| 1757629 | 22 | 1 | 1.0 | 8.746396 | 6.634493 |

228 rows × 5 columns
\n", "