Android Kotlin: Exception running Python script with Chaquopy

Question

You are running into a limitation of multiprocessing when a Python script is run through Chaquopy in an Android app. The error message indicates that the multiprocessing.dummy module has no cpu_count attribute. That is expected: multiprocessing.dummy is a thread-based wrapper that replicates only part of the multiprocessing API (Pool, Process, Lock, and similar), and cpu_count is not among the names it exports, on Android or anywhere else.

To resolve this issue, you have a few options:

  1. Remove or Refactor the Multiprocessing Code: If your script's functionality doesn't critically depend on separate processes, you could refactor it to use threads instead. This would involve replacing multiprocessing.Pool with multiprocessing.dummy.Pool or concurrent.futures.ThreadPoolExecutor, or managing threading.Thread objects directly.

  2. Use a Different Library: If multiprocessing is essential for your script and cannot be easily refactored, you might consider using a different library or approach that is more compatible with Android. Chaquopy may have limitations when it comes to multiprocessing due to the Android environment.

  3. Contact Chaquopy Support: You could reach out to Chaquopy's support or community forums to inquire about possible workarounds or solutions specific to running multiprocessing in an Android environment using Chaquopy.

In any case, keep in mind that Android restricts process-based parallelism more than thread-based parallelism: as the sem_open error shows, the synchronization primitives that multiprocessing relies on are unavailable, while ordinary threads work. Depending on your use case and requirements, you may need to adjust your approach to work within these limits.
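
As a rough illustration of option 1, the sketch below maps a function over a list of files with a thread pool from the standard library. The names fingerprint_stub and map_in_threads are hypothetical and not part of AudioCompare; the stub only stands in for an expensive per-file function.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fingerprint_stub(path):
    # Hypothetical stand-in for an expensive per-file function.
    return (path, len(path))


def map_in_threads(func, items, workers=None):
    """Apply func to every item using a pool of threads, preserving order."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(func, items))


results = map_in_threads(fingerprint_stub, ["file1.wav", "file2.wav"])
print(results)
```

Threads share one interpreter, so this avoids the process-level primitives that fail on Android, at the cost of no true parallelism for pure-Python CPU work.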

Original question:

As I haven't found an Android library to compare two .wav audio files (I only found musicg, which is not working for me), I decided to try one of the many I've found for Python, specifically AudioCompare.

To do that I followed the instructions on the Chaquopy page and was able to install v14 with no problems, so I can now run Python scripts from my Android app. The problem is that the audio comparison library I'm trying to run throws the following exception:

com.chaquo.python.PyException: OSError: This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770.

I don't know Python, but I'm fairly sure the exception is thrown in the Matcher.py module (I don't know how to check the line number, as the exception doesn't give me that information). In any case, I'll paste all the files:

main.py:

#!/usr/bin/env python
from error import *
from Matcher import Matcher
from argparse import ArgumentParser

def audio_matcher():
    """Our main control flow."""

    parser = ArgumentParser(
        description="Compare two audio files to determine if one "
                    "was derived from the other. Supports WAVE and MP3.",
        prog="audiomatch")
    parser.add_argument("-f", action="append",
                        required=False, dest="files",
                        default=list(),
                        help="A file to examine.")
    parser.add_argument("-d", action="append",
                        required=False, dest="dirs",
                        default=list(),
                        help="A directory of files to examine. "
                             "Directory must contain only audio files.")

    args = parser.parse_args()

    from os.path import dirname, join
    filename1 = join(dirname(__file__), "file1.wav")
    filename2 = join(dirname(__file__), "file2.wav")

    search_paths = [filename1, filename2]
    #search_paths = args.dirs + args.files

    if len(search_paths) != 2:
        die("Must provide exactly two input files or directories.")

    code = 0
    # Use our matching system
    matcher = Matcher(search_paths[0], search_paths[1])
    results = matcher.match()

    for match in results:
        if not match.success:
            code = 1
            warn(match.message)
        else:
            print(match)

    return code

if __name__ == "__main__":
    exit(audio_matcher())

Matcher.py (from https://github.com/charlesconnell/AudioCompare):

import math
import itertools
from FFT import FFT
import numpy as np
from collections import defaultdict
from InputFile import InputFile
import multiprocessing
#from multiprocessing.dummy import Pool as ThreadPool
import os
import stat
from error import *
from common import *

BUCKET_SIZE = 20
BUCKETS = 4
BITS_PER_NUMBER = int(math.ceil(math.log(BUCKET_SIZE, 2)))
assert((BITS_PER_NUMBER * BUCKETS) <= 32)
NORMAL_CHUNK_SIZE = 1024
NORMAL_SAMPLE_RATE = 44100.0
SCORE_THRESHOLD = 5

class FileResult(BaseResult):
    """The result of fingerprinting
    an entire audio file."""
    def __init__(self, fingerprints, file_len, filename):
        super(FileResult, self).__init__(True, "")
        self.fingerprints = fingerprints
        self.file_len = file_len
        self.filename = filename

    def __str__(self):
        return self.filename

class ChunkInfo(object):
    """These objects will be the values in
    our master hashes that map fingerprints
    to instances of this class."""
    def __init__(self, chunk_index, filename):
        self.chunk_index = chunk_index
        self.filename = filename

    def __str__(self):
        return "Chunk: {c}, File: {f}".format(c=self.chunk_index, f=self.filename)

class MatchResult(BaseResult):
    """The result of comparing two files."""
    def __init__(self, file1, file2, file1_len, file2_len, score):
        super(MatchResult, self).__init__(True, "")
        self.file1 = file1
        self.file2 = file2
        self.file1_len = file1_len
        self.file2_len = file2_len
        self.score = score

    def __str__(self):
        short_file1 = os.path.basename(self.file1)
        short_file2 = os.path.basename(self.file2)
        if self.score > SCORE_THRESHOLD:
            if self.file1_len < self.file2_len:
                return "MATCH {f1} {f2} ({s})".format(f1=short_file1, f2=short_file2, s=self.score)
            else:
                return "MATCH {f2} {f1} ({s})".format(f1=short_file1, f2=short_file2, s=self.score)
        else:
            return "NO MATCH"

def _to_fingerprints(freq_chunks):
    """Examine the results of running chunks of audio
    samples through FFT. For each chunk, look at the frequencies
    that are loudest in each "bucket." A bucket is a series of
    frequencies. Return the indices of the loudest frequency in each
    bucket in each chunk. These indices will be encoded into
    a single number per chunk."""
    chunks = len(freq_chunks)
    fingerprints = np.zeros(chunks, dtype=np.uint32)
    # Examine each chunk independently
    for chunk in range(chunks):
        fingerprint = 0
        for bucket in range(BUCKETS):
            start_index = bucket * BUCKET_SIZE
            end_index = (bucket + 1) * BUCKET_SIZE
            bucket_vals = freq_chunks[chunk][start_index:end_index]
            max_index = bucket_vals.argmax()
            fingerprint += (max_index << (bucket * BITS_PER_NUMBER))
        fingerprints[chunk] = fingerprint
    # return the indexes of the loudest frequencies
    return fingerprints

def _file_fingerprint(filename):
    """Read the samples from the files, run them through FFT,
    find the loudest frequencies to use as fingerprints,
    turn those into a hash table.
    Returns a 2-tuple containing the length
    of the file in seconds, and the hash table."""
    # Open the file
    try:
        file = InputFile(filename)
        # Read samples from the input files, divide them
        # into chunks by time, and convert the samples in each
        # chunk into the frequency domain.
        # The chunk size is dependent on the sample rate of the
        # file. It is important that each chunk represent the
        # same amount of time, regardless of the sample
        # rate of the file.
        chunk_size_adjust_factor = (NORMAL_SAMPLE_RATE / file.get_sample_rate())
        fft = FFT(file, int(NORMAL_CHUNK_SIZE / chunk_size_adjust_factor))
        series = fft.series()
        file_len = file.get_total_samples() / file.get_sample_rate()
        file.close()
        # Find the indices of the loudest frequencies
        # in each "bucket" of frequencies (for every chunk).
        # These loud frequencies will become the
        # fingerprints that we'll use for matching.
        # Each chunk will be reduced to a tuple of
        # 4 numbers which are 4 of the loudest frequencies
        # in that chunk.
        # Convert each tuple in winners to a single
        # number. This number is unique for each possible
        # tuple. This hopefully makes things more
        # efficient.
        fingerprints = _to_fingerprints(series)
    except Exception as e:
        return FileErrorResult(str(e))
    return FileResult(fingerprints, file_len, filename)

class Matcher(object):
    """Create an instance of this class to use our matching system."""
    def __init__(self, dir1, dir2):
        """The two arguments should be strings that are
        file or directory paths. For files, we will simply
        examine these files. For directories, we will scan
        them for files."""
        self.dir1 = dir1
        self.dir2 = dir2

    @staticmethod
    def __search_dir(dir):
        """Returns the regular files residing
        in the given directory, OR if the input
        is a regular file, return a 1-element
        list containing this file. All paths
        returned will be absolute paths."""
        results = []
        # Get the absolute path of our search dir
        abs_dir = os.path.abspath(dir)
        # Get info about the directory provide
        dir_stat = os.stat(abs_dir)
        # If it's really a file, just
        # return the name of it
        if stat.S_ISREG(dir_stat.st_mode):
            results.append(abs_dir)
            return results
        # If it's neither a file nor directory,
        # bail out
        if not stat.S_ISDIR(dir_stat.st_mode):
            die("{d} is not a directory or a regular file.".format(d=abs_dir))
        # Scan through the contents of the
        # directory (non-recursively).
        contents = os.listdir(abs_dir)
        for node in contents:
            abs_node = abs_dir + os.sep + node
            node_stat = os.stat(abs_node)
            # If we find a regular file, add
            # that to our results list, otherwise
            # warn the user.
            if stat.S_ISREG(node_stat.st_mode):
                results.append(abs_node)
            else:
                warn("An inode that is not a regular file was found at {f}".format(abs_node))
        return results

    @staticmethod
    def __combine_hashes(files):
        """Take a list of FileResult objects and
        create a hash that maps all of their fingerprints
        to ChunkInfo objects."""
        master = defaultdict(list)
        for f in files:
            for chunk in range(len(f.fingerprints)):
                hash = f.fingerprints[chunk]
                master[hash].append(ChunkInfo(chunk, f.filename))
        return master

    @staticmethod
    def __file_lengths(files):
        """Take a list of FileResult objects and
        create a hash that maps their filenames
        to the length of each file, in seconds."""
        results = {}
        for f in files:
            results[f.filename] = f.file_len
        return results

    @staticmethod
    def __report_file_matches(file, master_hash, file_lengths):
        """Find files from the master hash that match
        the given file.
        @param file A FileResult object that is our query
        @param master_hash The data to search through
        @param file_lengths A hash mapping filenames to file lengths
        @return A list of MatchResult objects, one for every file
        that was represented in master_hash"""
        results = []
        # A hash that maps filenames to "offset" hashes. Then,
        # an offset hash maps the difference in chunk numbers of
        # the matches we will find.
        # We'll map those differences to the number of matches
        # found with that difference.
        # This allows us to see if many fingerprints
        # from different files occurred at the same
        # time offsets relative to each other.
        file_match_offsets = {}
        for f in file_lengths:
            file_match_offsets[f] = defaultdict(lambda: 0)
        # For each chunk in the query file
        for query_chunk_index in range(len(file.fingerprints)):
            # See if that chunk's fingerprint is in our master hash
            chunk_fingerprint = file.fingerprints[query_chunk_index]
            if chunk_fingerprint in master_hash:
                # If it is, record the offset between our query chunk
                # and the found chunk
                for matching_chunk in master_hash[chunk_fingerprint]:
                    offset = matching_chunk.chunk_index - query_chunk_index
                    file_match_offsets[matching_chunk.filename][offset] += 1
        # For each file that was in master_hash,
        # we examine the offsets of the matching fingerprints we found
        for f in file_match_offsets:
            offsets = file_match_offsets[f]
            # The length of the shorter file is important
            # to deciding whether two audio files match.
            min_len = min(file_lengths[f], file.file_len)
            # max_offset is the highest number of times that two matching
            # hash keys were found with the same time difference
            # relative to each other.
            if len(offsets) != 0:
                max_offset = max(offsets.values())
            else:
                max_offset = 0
            # The score is the ratio of max_offset (as explained above)
            # to the length of the shorter file. A short file that should
            # match another file will result in less matching fingerprints
            # than a long file would, so we take this into account. At the
            # same time, a long file that should *not* match another file
            # will generate a decent number of matching fingerprints by
            # pure chance, so this corrects for that as well.
            if min_len > 0:
                score = max_offset / min_len
            else:
                score = 0
            results.append(MatchResult(file.filename, f, file.file_len, file_lengths[f], score))
        return results

    def match(self):
        """Takes two AbstractInputFiles as input,
        and returns a boolean as output, indicating
        if the two files match."""
        dir1_files = Matcher.__search_dir(self.dir1)
        dir2_files = Matcher.__search_dir(self.dir2)
        # Try to determine how many
        # processors are in the computer
        # we're running on, to determine
        # the appropriate amount of parallelism
        # to use
        try:
            cpus = multiprocessing.cpu_count()
        except NotImplementedError:
            cpus = 1
        # Construct a process pool to give the task of
        # fingerprinting audio files
        pool = multiprocessing.Pool(cpus)
        try:
            # Get the fingerprints from each input file.
            # Do this using a pool of processes in order
            # to parallelize the work neatly.
            map1_result = pool.map_async(_file_fingerprint, dir1_files)
            map2_result = pool.map_async(_file_fingerprint, dir2_files)
            # Wait for pool to finish processing
            pool.close()
            pool.join()
            # Get results from process pool
            dir1_results = map1_result.get()
            dir2_results = map2_result.get()
        except KeyboardInterrupt:
            pool.terminate()
            raise
        results = []
        # If there was an error in fingerprinting a file,
        # add a special ErrorResult to our results list
        results.extend([x for x in dir1_results if not x.success])
        results.extend([x for x in dir2_results if not x.success])
        # Proceed only with fingerprints that were computed
        # successfully
        dir1_successes = [x for x in dir1_results if x.success and x.file_len > 0]
        dir2_successes = [x for x in dir2_results if x.success and x.file_len > 0]
        # Empty files should match other empty files
        # Our matching algorithm will not report these as a match,
        # so we have to make a special case for it.
        dir1_empty_files = [x for x in dir1_results if x.success and x.file_len == 0]
        dir2_empty_files = [x for x in dir2_results if x.success and x.file_len == 0]
        # Every empty file should match every other empty file
        for empty_file1, empty_file2 in itertools.product(dir1_empty_files, dir2_empty_files):
            results.append(MatchResult(empty_file1.filename, empty_file2.filename, empty_file1.file_len, empty_file2.file_len, SCORE_THRESHOLD + 1))
        # This maps filenames to the lengths of the files
        dir1_file_lengths = Matcher.__file_lengths(dir1_successes)
        dir2_file_lengths = Matcher.__file_lengths(dir2_successes)
        # Get the combined sizes of the files in our two search
        # paths
        dir1_size = sum(dir1_file_lengths.values())
        dir2_size = sum(dir2_file_lengths.values())
        # Whichever search path has more data in it is the
        # one we want to put in the master hash, and then query
        # via the other one
        if dir1_size < dir2_size:
            dir_successes = dir1_successes
            master_hash = Matcher.__combine_hashes(dir2_successes)
            file_lengths = dir2_file_lengths
        else:
            dir_successes = dir2_successes
            master_hash = Matcher.__combine_hashes(dir1_successes)
            file_lengths = dir1_file_lengths
        # Loop through each file in the first search path our
        # program was given.
        for file in dir_successes:
            # For each file, check its fingerprints against those in the
            # second search path. For matching
            # fingerprints, look up the the times (chunk number)
            # that the fingerprint occurred
            # in each file. Store the time differences in
            # offsets. The point of this is to see if there
            # are many matching fingerprints at the
            # same time difference relative to each
            # other. This indicates that the two files
            # contain similar audio.
            file_matches = Matcher.__report_file_matches(file, master_hash, file_lengths)
            results.extend(file_matches)
        return results

Maybe it won't work after all this effort, but I'd at least like to give it a try.

Any help in getting this matcher script to run would be much appreciated.

Full exception:

0 = {StackTraceElement@19282} "<python>.java.android.__init__(__init__.py:140)"
1 = {StackTraceElement@19283} "<python>.multiprocessing.synchronize.__init__(synchronize.py:57)"
2 = {StackTraceElement@19284} "<python>.multiprocessing.synchronize.__init__(synchronize.py:162)"
3 = {StackTraceElement@19285} "<python>.multiprocessing.context.Lock(context.py:68)"
4 = {StackTraceElement@19286} "<python>.multiprocessing.queues.__init__(queues.py:336)"
5 = {StackTraceElement@19287} "<python>.multiprocessing.context.SimpleQueue(context.py:113)"
6 = {StackTraceElement@19288} "<python>.multiprocessing.pool._setup_queues(pool.py:343)"
7 = {StackTraceElement@19289} "<python>.multiprocessing.pool.__init__(pool.py:191)"
8 = {StackTraceElement@19290} "<python>.multiprocessing.context.Pool(context.py:119)"
9 = {StackTraceElement@19291} "<python>.Matcher.match(Matcher.py:306)"
10 = {StackTraceElement@19292} "<python>.main.audio_matcher(main.py:38)"
11 = {StackTraceElement@19293} "<python>.chaquopy_java.call(chaquopy_java.pyx:354)"
12 = {StackTraceElement@19294} "<python>.chaquopy_java.Java_com_chaquo_python_PyObject_callAttrThrowsNative(chaquopy_java.pyx:326)"
13 = {StackTraceElement@19295} "com.chaquo.python.PyObject.callAttrThrowsNative(Native Method)"
14 = {StackTraceElement@19296} "com.chaquo.python.PyObject.callAttrThrows(PyObject.java:232)"
15 = {StackTraceElement@19297} "com.chaquo.python.PyObject.callAttr(PyObject.java:221)"
16 = {StackTraceElement@19298} "com.testmepracticetool.toeflsatactexamprep.ui.activities.main.MainActivity.startActivity(MainActivity.kt:104)"
17 = {StackTraceElement@19299} "com.testmepracticetool.toeflsatactexamprep.ui.activities.main.MainActivity.onCreate(MainActivity.kt:80)"
18 = {StackTraceElement@19300} "android.app.Activity.performCreate(Activity.java:7994)"
19 = {StackTraceElement@19301} "android.app.Activity.performCreate(Activity.java:7978)"
20 = {StackTraceElement@19302} "android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1309)"
21 = {StackTraceElement@19303} "android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3422)"
22 = {StackTraceElement@19304} "android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3601)"
23 = {StackTraceElement@19305} "android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:85)"
24 = {StackTraceElement@19306} "android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)"
25 = {StackTraceElement@19307} "android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)"
26 = {StackTraceElement@19308} "android.app.ActivityThread$H.handleMessage(ActivityThread.java:2066)"
27 = {StackTraceElement@19309} "android.os.Handler.dispatchMessage(Handler.java:106)"
28 = {StackTraceElement@19310} "android.os.Looper.loop(Looper.java:223)"
29 = {StackTraceElement@19311} "android.app.ActivityThread.main(ActivityThread.java:7656)"
30 = {StackTraceElement@19312} "java.lang.reflect.Method.invoke(Native Method)"
31 = {StackTraceElement@19313} "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)"
32 = {StackTraceElement@19314} "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)"
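
The Python frames in this trace already carry file names and line numbers (for example Matcher.py:306). As an alternative way to get the same information as plain text, the call can be wrapped on the Python side with the standard traceback module. This is only a sketch: run_and_report and failing_example are hypothetical names, and the failing function merely stands in for audio_matcher().

```python
import traceback


def run_and_report(func, *args):
    """Call func; return (result, None) or (None, traceback_text) on failure."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() returns the full Python traceback, with line numbers.
        return None, traceback.format_exc()


def failing_example():
    # Hypothetical function that fails, standing in for audio_matcher().
    raise OSError("simulated failure")


result, error_text = run_and_report(failing_example)
print(error_text)
```

The returned string can then be logged or shown from the Kotlin side like any other return value.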

Edit 1: New exception after @mhsmith's great help:

After replacing the multiprocessing import, I now get the following exception:

com.chaquo.python.PyException: AttributeError: module 'multiprocessing.dummy' has no attribute 'cpu_count'
0 = {StackTraceElement@19430} "<python>.Matcher.match(Matcher.py:300)"
1 = {StackTraceElement@19431} "<python>.main.audio_matcher(main.py:38)"
2 = {StackTraceElement@19432} "<python>.chaquopy_java.call(chaquopy_java.pyx:354)"
3 = {StackTraceElement@19433} "<python>.chaquopy_java.Java_com_chaquo_python_PyObject_callAttrThrowsNative(chaquopy_java.pyx:326)"
4 = {StackTraceElement@19434} "com.chaquo.python.PyObject.callAttrThrowsNative(Native Method)"
5 = {StackTraceElement@19435} "com.chaquo.python.PyObject.callAttrThrows(PyObject.java:232)"
6 = {StackTraceElement@19436} "com.chaquo.python.PyObject.callAttr(PyObject.java:221)"
7 = {StackTraceElement@19437} "com.testmepracticetool.toeflsatactexamprep.ui.activities.main.MainActivity.startActivity(MainActivity.kt:104)"
8 = {StackTraceElement@19438} "com.testmepracticetool.toeflsatactexamprep.ui.activities.main.MainActivity.onCreate(MainActivity.kt:80)"
9 = {StackTraceElement@19439} "android.app.Activity.performCreate(Activity.java:7994)"
10 = {StackTraceElement@19440} "android.app.Activity.performCreate(Activity.java:7978)"
11 = {StackTraceElement@19441} "android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1309)"
12 = {StackTraceElement@19442} "android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3422)"
13 = {StackTraceElement@19443} "android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3601)"
14 = {StackTraceElement@19444} "android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:85)"
15 = {StackTraceElement@19445} "android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)"
16 = {StackTraceElement@19446} "android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)"
17 = {StackTraceElement@19447} "android.app.ActivityThread$H.handleMessage(ActivityThread.java:2066)"
18 = {StackTraceElement@19448} "android.os.Handler.dispatchMessage(Handler.java:106)"
19 = {StackTraceElement@19449} "android.os.Looper.loop(Looper.java:223)"
20 = {StackTraceElement@19450} "android.app.ActivityThread.main(ActivityThread.java:7656)"
21 = {StackTraceElement@19451} "java.lang.reflect.Method.invoke(Native Method)"
22 = {StackTraceElement@19452} "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)"
23 = {StackTraceElement@19453} "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)"

Answer 1

Score: 1

As it says in the Chaquopy documentation:

> Because Android doesn’t support POSIX semaphores, most of the multiprocessing APIs will fail with the error “This platform lacks a functioning sem_open implementation”. The simplest solution is to use multiprocessing.dummy instead.

I see you've already attempted to do this, but the correct way is to replace
import multiprocessing with import multiprocessing.dummy as multiprocessing.
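
A minimal sketch of what that aliased import gives you, assuming a trivial worker function (square is hypothetical): the Pool API is the same, but the work runs in threads of the current process. The last line also checks for cpu_count, the name behind the second exception in the question.

```python
import multiprocessing.dummy as multiprocessing  # thread-backed drop-in


def square(x):
    # Hypothetical worker, standing in for _file_fingerprint.
    return x * x


# Same Pool API as multiprocessing.Pool, but backed by threads.
with multiprocessing.Pool(2) as pool:
    squares = pool.map(square, [1, 2, 3])

print(squares)
# multiprocessing.dummy does not re-export every multiprocessing name.
print(hasattr(multiprocessing, "cpu_count"))
```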


Edit: for the second exception, the simplest solution is to rewrite the import statements as follows:

from multiprocessing import cpu_count
from multiprocessing.dummy import Pool

And then remove the multiprocessing. prefix from the places where those names are used in the rest of the file.
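
Applied to the pool section of Matcher.match, the change might look roughly like the sketch below. The helper fingerprint_all and the stub _file_fingerprint are assumptions made so the example runs on its own; in the real file only the imports and the two call sites (cpu_count() and Pool(cpus)) would change.

```python
from multiprocessing import cpu_count
from multiprocessing.dummy import Pool  # thread pool with the Pool API


def _file_fingerprint(filename):
    # Hypothetical stand-in for the real function in Matcher.py.
    return "fingerprint of " + filename


def fingerprint_all(dir1_files, dir2_files):
    """Fingerprint two lists of files using a pool of threads."""
    try:
        cpus = cpu_count()  # was multiprocessing.cpu_count()
    except NotImplementedError:
        cpus = 1
    pool = Pool(cpus)  # was multiprocessing.Pool(cpus)
    try:
        map1_result = pool.map_async(_file_fingerprint, dir1_files)
        map2_result = pool.map_async(_file_fingerprint, dir2_files)
        # Wait for the pool to finish processing.
        pool.close()
        pool.join()
        return map1_result.get(), map2_result.get()
    except KeyboardInterrupt:
        pool.terminate()
        raise


dir1_results, dir2_results = fingerprint_all(["a.wav"], ["b.wav", "c.wav"])
print(dir1_results, dir2_results)
```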

huangapple
  • Posted on June 6, 2023, 02:53:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76409244.html