如何从Django应用程序连接Databricks Delta表

huangapple go评论70阅读模式
英文:

How to connect databricks delta tables from Django application

问题

我是Django框架的新手,在我的项目中,我有一个要求,我需要连接到Databricks Delta表并对其进行CRUD操作。如果我们可以连接它。我请求您列出步骤。谢谢。

英文:

I am new to Django framework, in my project I have requirement that I have to connect to databricks delta tables and perform CRUD operation on it.
If we can connect it. I request you to list down the steps please.Thanks.

答案1

得分: 0

你尝试使用databricks-sql-connector是正确的,但是你在settings.py中的数据库详细信息方面存在混淆,那里不能添加databricks集群数据库。根据文档,以下是Django支持的数据库:

  1. PostgreSQL

  2. MariaDB

  3. MySQL

  4. Oracle

  5. SQLite

所以在这里的想法是使用databricks-sql-connector从Delta表获取数据,并根据你的应用程序来处理这些数据。

以下是要遵循的步骤:

步骤1:
在你的Django Python环境中安装Python包

https://pypi.org/project/databricks-sql-connector/

pip install databricks-sql-connector

步骤2: 在settings.py中进行配置
保留默认数据库不变。

步骤3:
运行Python迁移。

python .\manage.py makemigrations

步骤4: 在databricks集群中创建一个Delta表

%sql
CREATE TABLE products (
product_name string,
price int)
USING DELTA;

上面是创建表的代码。你可以看到列名与models.py中创建的模型字段相似。

步骤5: 向表中插入数据

insert into products values ('TV', 60000)
insert into products values ('Iphone', 50000)
insert into products values ('Macbook', 150000)

步骤6:
从databricks集群获取主机、http路径和访问令牌。

你可以在Cluster > Configuration > Advanced Options > JDBC/ODBC下找到主机和http路径。

要获取访问令牌,你需要创建一个。进入databricks菜单右侧的下拉菜单 > User settings > 生成新令牌。

步骤7:
现在创建一个视图来获取上面的数据。以下是一个读操作的示例。

from django.shortcuts import render
from databricks import sql

def get(request):
    connection = sql.connect(
        server_hostname=your_host,
        http_path=your_http_path,
        access_token=your_access_token
    )
    cursor = connection.cursor()
    cursor.execute('SELECT * FROM `products`')
    result = cursor.fetchall()
    tmpB = []
    for row in result:
        dictRow = row.asDict()
        tmpB.append(dictRow)
    cursor.close()
    connection.close()
    return render(request, 'details.html', {'product': tmpB})

获取数据后,你可以在模板中呈现它,或者你可以创建模型对象。从步骤8开始,显示如何将检索到的数据存储在Django默认数据库SQLite中。

details.html中的代码

{% block content %}
<h1>Book List</h1>
<ul>
{% for prd in product %}
<li>{{ prd.product_name }} : Price = {{ prd.price}}</li>
{% endfor %}
</ul>
{% endblock %}

结果:这是在页面上显示的结果。

步骤8:
这是将检索到的数据存储在默认数据库中的步骤。在models.py中创建模型。

models.py中的代码

from django.db import models

class Product(models.Model):
    product_name = models.CharField(max_length=200)
    price = models.IntegerField(default=0)

步骤9:
再次运行Python迁移命令

python .\manage.py makemigrations

这将创建一个名为0001.py的迁移文件,如图所示。

然后运行SQL迁移命令,提供名称如下,该命令将在默认数据库中创建所需的表格。

python .\manage.py sqlmigrate myapp 0001

步骤10:
获取数据并使用它创建模型对象。在views.py中的代码。

from django.shortcuts import render
from .models import Product
from databricks import sql

def get(request):
    connection = sql.connect(
        server_hostname=host,
        http_path=http_path,
        access_token=access_token
    )
    cursor = connection.cursor()
    cursor.execute('SELECT * FROM `products`')
    result = cursor.fetchall()
    for row in result:
        dictRow = row.asDict()
        dataToSave = Product(product_name=dictRow['product_name'], price=dictRow['price'])
        dataToSave.save()
    cursor close()
    connection.close()
    productDT = Product.objects.all()
    return render(request, 'details.html', {'product': productDT})

但确保在Delta表和模型字段中使用适当的列名。

同样,你可以执行CRUD操作。以下文档将对此有所帮助。

https://learn.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector

英文:

what you tried using databricks-sql-connector was correct,
But you have confusion about the database details in settings.py, there you can’t add databricks cluster database. As per documentation following are the supporting

Database you can use in Django. (https://docs.djangoproject.com/en/4.1/ref/databases/)

  1. PostgreSQL

  2. MariaDB

  3. MySQL

  4. Oracle

  5. SQLite

So here the idea is get data from delta tables using databricks-sql-connector and handle those data

According to your application.

Below are the steps to follow,

Step 1:
Install python package in your Django python environment

https://pypi.org/project/databricks-sql-connector/

pip install databricks-sql-connector

Step 2: Configurations in setting.py
Leave the default database as it is.

如何从Django应用程序连接Databricks Delta表

Step 3:
Run python migration.

python .\manage.py makemigrations

Step 4: Create a delta table in databricks cluster

如何从Django应用程序连接Databricks Delta表

%sql
CREATE TABLE products (
product_name string,
price int)
USING DELTA;

Above is the code for creating table.
Here you can see column names are similar to model fields created in models.py.

Step 5: Insert data into the table

insert into products values (&#39;TV&#39;,60000)
insert into products values (Iphone,50000)
insert into products values (Macbook,150000)

Step 6:

Get the host, http_path and access token from databricks cluster.

You can find your host and http path under Cluster > Configuration > Advanced Options > JDBC/ODBC

如何从Django应用程序连接Databricks Delta表

For access token, you need to create one.
Go to the drop down on right of databricks menu > User settings > Generate new token .

Step 7:

Know create a view to get the above data.
Here is an example for read operation

from django.shortcuts import render
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname= your_host,
http_path= your_http_path,
access_token= your_access_token
cursor = connection.cursor()
cursor.execute(&#39;SELECT * FROM `products`&#39;)
result = cursor.fetchall()
tmpB=[]
for row in result:
dictRow = row.asDict()
tmpB.append(dictRow)
cursor.close()
connection.close()
return render(request,&#39;details.html&#39;,{product:tmpB})

After getting data you can just render it in your template or you can create Model object. From Step 8 shows how to store retrieved data in the Django default database sqlite .

Code in details.html

如何从Django应用程序连接Databricks Delta表

{% block content %}
&lt;h1&gt;Book List&lt;/h1&gt;
&lt;ul&gt;
{% for prd in product %}
&lt;li&gt;{{ prd.product_name }} : Price = {{ prd.price}}&lt;/a&gt;&lt;/li&gt;
{% endfor %}
&lt;/ul&gt;
{% endblock %}

Result: Here are the results displayed in the page.

如何从Django应用程序连接Databricks Delta表

如何从Django应用程序连接Databricks Delta表

Step 8:

Here the step for storing the retrieved data in default database
Create model in model.py

Code in models.py

from django.db import models
# Create your models here.
class Product(models.Model):
product_name = models.CharField(max_length=200)
price = models.IntegerField(default=0)

Step 9:
Know again run the python migration command

python .\manage.py makemigrations

This will create a migration files with name strating 0001.py as shown in image.

如何从Django应用程序连接Databricks Delta表

Then run sql migrate command with providing the name as below, this command creates required tables in default database.

python .\manage.py sqlmigrate myapp 0001

如何从Django应用程序连接Databricks Delta表

Step 10:
Know get the data and create model object using it.
Code in views.py

from django.shortcuts import render
from .models import Product
from databricks import sql
def get(request):
connection = sql.connect(
server_hostname=host,
http_path=http_path,
access_token=access_token)
cursor = connection.cursor()
cursor.execute(&#39;SELECT * FROM `products`&#39;)
result = cursor.fetchall()
for row in result:
dictRow = row.asDict()
dataTosave =  Product(product_name = dictRow[&#39;product_name&#39;],price = dictRow[&#39;price&#39;])
dataTosave.save()
cursor.close()
connection.close()
productDT = Product.objects.all()
return render(request,&#39;details.html&#39;,{product: productDT })

But make sure you use appropriate column names in delta tables and model fields while creating objects.

Similarly, you can execute CRUD operations. Below documentation will help you for it.

https://learn.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector

huangapple
  • 本文由 发表于 2023年3月7日 19:33:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/75661448.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定