英文:
How to connect databricks delta tables from Django application
问题
我是Django框架的新手,在我的项目中,我有一个要求,我需要连接到Databricks Delta表并对其进行CRUD操作。如果我们可以连接它。我请求您列出步骤。谢谢。
英文:
I am new to Django framework, in my project I have requirement that I have to connect to databricks delta tables and perform CRUD operation on it.
If we can connect it. I request you to list down the steps please.Thanks.
答案1
得分: 0
你尝试使用databricks-sql-connector是正确的,但是你在settings.py中的数据库详细信息方面存在混淆,那里不能添加databricks集群数据库。根据文档,以下是Django支持的数据库:
-
PostgreSQL
-
MariaDB
-
MySQL
-
Oracle
-
SQLite
所以在这里的想法是使用databricks-sql-connector从Delta表获取数据,并根据你的应用程序来处理这些数据。
以下是要遵循的步骤:
步骤1:
在你的Django Python环境中安装Python包
https://pypi.org/project/databricks-sql-connector/
pip install databricks-sql-connector
步骤2: 在settings.py中进行配置
保留默认数据库不变。
步骤3:
运行Python迁移。
python .\manage.py makemigrations
步骤4: 在databricks集群中创建一个Delta表
%sql
CREATE TABLE products (
product_name string,
price int)
USING DELTA;
上面是创建表的代码。你可以看到列名与models.py中创建的模型字段相似。
步骤5: 向表中插入数据
insert into products values ('TV', 60000)
insert into products values ('Iphone', 50000)
insert into products values ('Macbook', 150000)
步骤6:
从databricks集群获取主机、http路径和访问令牌。
你可以在Cluster > Configuration > Advanced Options > JDBC/ODBC下找到主机和http路径。
要获取访问令牌,你需要创建一个。进入databricks菜单右侧的下拉菜单 > User settings > 生成新令牌。
步骤7:
现在创建一个视图来获取上面的数据。以下是一个读操作的示例。
from django.shortcuts import render
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname=your_host,
http_path=your_http_path,
access_token=your_access_token
)
cursor = connection.cursor()
cursor.execute('SELECT * FROM `products`')
result = cursor.fetchall()
tmpB = []
for row in result:
dictRow = row.asDict()
tmpB.append(dictRow)
cursor.close()
connection.close()
return render(request, 'details.html', {'product': tmpB})
获取数据后,你可以在模板中呈现它,或者你可以创建模型对象。从步骤8开始,显示如何将检索到的数据存储在Django默认数据库SQLite中。
details.html中的代码
{% block content %}
<h1>Book List</h1>
<ul>
{% for prd in product %}
<li>{{ prd.product_name }} : Price = {{ prd.price}}</li>
{% endfor %}
</ul>
{% endblock %}
结果:这是在页面上显示的结果。
步骤8:
这是将检索到的数据存储在默认数据库中的步骤。在models.py中创建模型。
models.py中的代码
from django.db import models
class Product(models.Model):
product_name = models.CharField(max_length=200)
price = models.IntegerField(default=0)
步骤9:
再次运行Python迁移命令
python .\manage.py makemigrations
这将创建一个名为0001.py的迁移文件,如图所示。
然后运行SQL迁移命令,提供名称如下,该命令将在默认数据库中创建所需的表格。
python .\manage.py sqlmigrate myapp 0001
步骤10:
获取数据并使用它创建模型对象。在views.py中的代码。
from django.shortcuts import render
from .models import Product
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname=host,
http_path=http_path,
access_token=access_token
)
cursor = connection.cursor()
cursor.execute('SELECT * FROM `products`')
result = cursor.fetchall()
for row in result:
dictRow = row.asDict()
dataToSave = Product(product_name=dictRow['product_name'], price=dictRow['price'])
dataToSave.save()
cursor close()
connection.close()
productDT = Product.objects.all()
return render(request, 'details.html', {'product': productDT})
但确保在Delta表和模型字段中使用适当的列名。
同样,你可以执行CRUD操作。以下文档将对此有所帮助。
https://learn.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector
英文:
what you tried using databricks-sql-connector was correct,
But you have confusion about the database details in settings.py, there you can’t add databricks cluster database. As per documentation following are the supporting
Database you can use in Django. (https://docs.djangoproject.com/en/4.1/ref/databases/)
-
PostgreSQL
-
MariaDB
-
MySQL
-
Oracle
-
SQLite
So here the idea is get data from delta tables using databricks-sql-connector and handle those data
According to your application.
Below are the steps to follow,
Step 1:
Install python package in your Django python environment
https://pypi.org/project/databricks-sql-connector/
pip install databricks-sql-connector
Step 2: Configurations in setting.py
Leave the default database as it is.
Step 3:
Run python migration.
python .\manage.py makemigrations
Step 4: Create a delta table in databricks cluster
%sql
CREATE TABLE products (
product_name string,
price int)
USING DELTA;
Above is the code for creating table.
Here you can see column names are similar to model fields created in models.py.
Step 5: Insert data into the table
insert into products values ('TV',60000)
insert into products values (Iphone,50000)
insert into products values (Macbook,150000)
Step 6:
Get the host, http_path and access token from databricks cluster.
You can find your host and http path under Cluster > Configuration > Advanced Options > JDBC/ODBC
For access token, you need to create one.
Go to the drop down on right of databricks menu > User settings > Generate new token .
Step 7:
Know create a view to get the above data.
Here is an example for read operation
from django.shortcuts import render
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname= your_host,
http_path= your_http_path,
access_token= your_access_token
cursor = connection.cursor()
cursor.execute('SELECT * FROM `products`')
result = cursor.fetchall()
tmpB=[]
for row in result:
dictRow = row.asDict()
tmpB.append(dictRow)
cursor.close()
connection.close()
return render(request,'details.html',{product:tmpB})
After getting data you can just render it in your template or you can create Model object. From Step 8 shows how to store retrieved data in the Django default database sqlite .
Code in details.html
{% block content %}
<h1>Book List</h1>
<ul>
{% for prd in product %}
<li>{{ prd.product_name }} : Price = {{ prd.price}}</a></li>
{% endfor %}
</ul>
{% endblock %}
Result: Here are the results displayed in the page.
Step 8:
Here the step for storing the retrieved data in default database
Create model in model.py
Code in models.py
from django.db import models
# Create your models here.
class Product(models.Model):
product_name = models.CharField(max_length=200)
price = models.IntegerField(default=0)
Step 9:
Know again run the python migration command
python .\manage.py makemigrations
This will create a migration files with name strating 0001.py as shown in image.
Then run sql migrate command with providing the name as below, this command creates required tables in default database.
python .\manage.py sqlmigrate myapp 0001
Step 10:
Know get the data and create model object using it.
Code in views.py
from django.shortcuts import render
from .models import Product
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname=host,
http_path=http_path,
access_token=access_token)
cursor = connection.cursor()
cursor.execute('SELECT * FROM `products`')
result = cursor.fetchall()
for row in result:
dictRow = row.asDict()
dataTosave = Product(product_name = dictRow['product_name'],price = dictRow['price'])
dataTosave.save()
cursor.close()
connection.close()
productDT = Product.objects.all()
return render(request,'details.html',{product: productDT })
But make sure you use appropriate column names in delta tables and model fields while creating objects.
Similarly, you can execute CRUD operations. Below documentation will help you for it.
https://learn.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论