tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:46900
  - loss:CachedMultipleNegativesRankingLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: >-
      Recurrent Neural Network Based MultiRobot Route Planning for SteepLand
      Harvesting Systems.Terrains with steep inclines cannot be utilized for
      crop production as it is hazardous to operate conventional farm equipment
      for various cropping tasks. This research which is a part of a larger
      effort to deploy a team of robots for seeding and other similar tasks
      proposes an approach for route planning of a team of ground mobile robots
      drones for seed delivery as well as an unmanned aerial vehicle UAV to
      continually replenish the drones with seeds. The overall cost is
      formulated in terms of linearly constrained integer quadratic programming
      LCICQ. Using a Lagrangian function a recurrent neural network with primal
      and dual sets of neurons is proposed to converge to optimal solutions.
    sentences:
      - >-
        Neural Path Planning Fixed Time NearOptimal Path Generation via Oracle
        Imitation.Fast and efficient path generation is critical for robots
        operating in complex environments. This motion planning problem is often
        performed in a robots actuation or configuration space where popular
        pathfinding methods such as A RRT get exponentially more computationally
        expensive to execute as the dimensionality increases or the spaces
        become more cluttered and complex. On the other hand if one were to save
        the entire set of paths connecting all pair of locations in the
        configuration space a priori one would run out of memory very quickly.
        In this work we introduce a novel way of producing fast and optimal
        motion plans for static environments by using a stepping neural network
        approach called OracleNet. OracleNet uses Recurrent Neural Networks to
        determine endtoend trajectories in an iterative manner that implicitly
        generates optimal motion plans with minimal loss in performance in a
        compact form. The algorithm is straightforward in implementation while
        consistently generating nearoptimal paths in a single iterative endtoend
        rollout. In practice OracleNet generally has fixedtime execution
        regardless of the configuration space complexity while outperforming
        popular pathfinding algorithms in complex environments and higher
        dimensions1.
      - >-
        A Critical Look at the 2019 College Admissions Scandal.Discusses the
        2019 College admissions scandal. Let me begin with a disclaimer I am
        making no legal excuses for the participants in the current scandal. I
        am only offering contextual background that places it in the broader
        academic cultural and political perspective required for understanding.
        It is only the most recent installment of a wellworn narrative the
        controlling elite make their own rules and live by them if they can get
        away with it. Unfortunately some of the participants who are either
        serving or facing jail time didnt know to not go into a gunfight with a
        sharp stick. Money alone is not enough to avoid prosecution for fraud
        you need political clout. The best protection a defendant can have is a
        prosecutor who fears political reprisal. Compare how the Koch brothers
        escaped prosecution for stealing millions of oil dollars from Native
        American tribes12 with the fate of actresses Lori Loughlin and Felicity
        Huffman who at the time of this writing face jail time for paying bribes
        to get their children into good universities.34 In the former case the
        federal prosecutor who dared to empanel a grand jury to get at the truth
        was fired for cause which put a quick end to the prosecution. In the
        latter case the prosecutors pushed for jail terms and public
        admonishment with the zeal of Oliver Cromwell. There you have it
        stealing oil from Native Americans versus trying to bribe your kids into
        a great university. Where is the greater crime Admittedly these
        actresses and their
      - >-
        Sensitivity Enhanced Photoacoustic Imaging Using a HighFrequency PZT
        Transducer with an Integrated FrontEnd Amplifier.Photoacoustic PA
        imaging is a hybrid imaging technique that can provide both structural
        and functional information of biological tissues. Due to limited
        permissible laser energy deposited on tissues highly sensitive PA
        imaging is required. Here we developed a 20 MHz lead zirconium titanate
        PZT transducer 1.5 mm  3 mm with frontend amplifier circuits for local
        signal processing to achieve sensitivity enhanced PA imaging. The
        electrical and acoustic performance was characterized. Experiments on
        phantoms and chicken breast tissue were conducted to validate the
        imaging performance. The fabricated prototype shows a bandwidth of 63
        and achieves a noise equivalent pressure NEP of 0.24 mPaHz and a
        receiving sensitivity of 62.1 μVPa at 20 MHz without degradation of the
        bandwidth. PA imaging of wire phantoms demonstrates that the prototype
        is capable of improving the detection sensitivity by 10 dB compared with
        the traditional transducer without integrated amplifier. In addition in
        vitro experiments on chicken breast tissue show that structures could be
        imaged with enhanced contrast using the prototype and the imaging depth
        range was improved by 1 mm. These results demonstrate that the
        transducer with an integrated frontend amplifier enables highly
        sensitive PA imaging with improved penetration depth. The proposed
        method holds the potential for visualization of deep tissue structures
        and enhanced detection of weak physiological changes.
  - source_sentence: >-
      A method for reconstructing label images from a few projections as
      motivated by electron microscopy.Our aim is to produce a tessellation of
      space into small voxels and based on only a few tomographic projections of
      an object assign to each voxel a label that indicates one of the
      components of interest constituting the object. Traditional methods are
      not reliable in applications such as electron microscopy in which due to
      the damage by radiation only a few projections are available. We postulate
      a low level prior knowledge regarding the underlying distribution of label
      images and then directly estimate the label image based on the prior and
      the projections. We use a relatively efficient approximation to a global
      search for the optimal estimate. Copyright Springer Science Business Media
      LLC 2006
    sentences:
      - >-
        Airline Miles Redemption.The business of Airline firms has deviated from
        their main business of flying passengers over the past decade. Now they
        have diversified into other lines of business as well. The revenue model
        has therefore changed over the past decade. Airline miles is one of the
        main revenue generating venture for airlines currently. It has been
        mentioned that airline miles business has turned cash cow for these
        firms. Their normal way of business flying passengers is not an
        attractive method of running business for them. Selling airline miles
        allow them to generate a higher revenue. We look at how the redemption
        of airline miles affect the bottom line of the company using data from
        publicly available data sources
      - >-
        Extensive Examination of XOR Arbiter PUFs as Security Primitives for
        ResourceConstrained IoT Devices.Communication security is essential for
        the proper functioning of the Internet of Things. Traditional approaches
        that rely on cryptographic keys are vulnerable to sidechannel attacks.
        Physical Unclonable Functions PUFs leveraging unavoidable and
        irreproducible variations of integrated circuits to produce responses
        unique for individual PUF devices are emerging as promising candidates
        as security primitives to provide keyless solutions. Before a PUF can be
        adopted for real applications the PUF must be thoroughly examined to
        understand its various properties for its application feasibility. In
        this paper we study XOR PUFs for broad ranges of values for circuit
        architecture parameters. XOR PUFs have been extensively studied and have
        been shown to be unable to withstand machine learning attacks for 64bit
        XOR PUFs with less than ten component PUFs. Attack methods employed in
        existing studies need a large number of challengeresponse pairs CRPs
        which are obtainable only if the PUF has an open access interface. When
        PUFembedded devices equipped with mutual authentication or response
        obfuscating techniques it is difficult for attackers to accumulate large
        numbers of CRPs. With only a small number of accumulated CRPs available
        to attackers small size PUFs like XOR PUFs with a small number of
        component PUFs and stages may become resistant to machine learning
        attacks. Since smaller sizes mean less resourcedemanding it is
        worthwhile to examine such PUFs which have usually been considered
        unsafe against attacks. Such are thoughts that have been motivating us
        in this paper to explore the PUF performances for a wide range of values
        of the PUF architecture parameters.
      - >-
        LocalConvexity Reinforcement for Scene Reconstruction from Sparse Point
        Clouds.Several methods reconstruct surfaces from sparse point clouds
        that are estimated from images. Most of them build 3D Delaunay
        triangulation of the points and compute occupancy labeling of the
        tetrahedra thanks to visibility information and surface constraints.
        However their most notable errors are falselylabeled freespace
        tetrahedra. We present labeling corrections of these errors based on a
        new shape constraint localconvexity. In the simplest case this means
        that a freespace tetrahedron of the Delaunay is relabeled matter if its
        size is small enough and all its vertices are in matter tetrahedra. The
        allowed corrections are more important in the vertical direction than in
        the horizontal ones to take into account the anisotropy of usual scenes.
        In the experiments our corrections improve the results of previous
        surface reconstruction methods applied to videos taken by a consumer 360
        camera.
  - source_sentence: >-
      Buzzer Detection to Maintain Information Neutrality in 2019 Indonesia
      Presidential Election.This paper proposed a method which detects a
      political buzzer in social media specifically Instagram. With Indonesia
      undergoing 2019 presidential election a detection of buzzers that causes
      much trouble in maintaining information neutrality is seen as a needed.
      One of the many reasons is because those buzzers spread false news making
      the information gained by the use of social media to be not neutral and
      deliberately offends or attack those that they are not in favor of. Those
      buzzers share a similar characteristic tendency or even possess the same
      pattern. Grouping classification and detection method are used to counter
      this problem. This research gives a slight overview of what is happening
      in social media and a theory of how to deal with those problems. The
      argument is expected to help to identify buzzer in real life thus helps in
      maintaining information neutrality along with the social media in
      Indonesia.
    sentences:
      - >-
        Fake News Detection on Social Media A Systematic Survey.These days there
        are instabilities in many societies in the world either because of
        political economic and other societal issues. The advance in mobile
        technology has enabled social media to play a vital role in organizing
        activities in favour or against certain parties or countries. Many
        researchers see the need to develop automated systems that are capable
        of detecting and tracking fake news on social media. In this paper we
        introduce a systematic survey on the process of fake news detection on
        social media. The types of data and the categories of features used in
        the detection model as well as benchmark datasets are discussed.
      - >-
        Automatic Guided Waves Data Transmission System Using an Oil Industry
        Multiwire Cable.Alternative wireless data communication systems are a
        necessity in industries that operate in harsh environments such as the
        oil and gas industry. Ultrasonic guided wave propagation through solid
        metallic structures such as metal barriers rods and multiwire cables
        have been proposed for data transmission purposes. In this context
        multiwire cables have been explored as a communication media for the
        transmission of encoded ultrasonic guided waves. This work presents the
        proprietary hardware design and implementation of an automatic data
        transmission system based on the propagation of ultrasonic guided waves
        using as communication channels a hightemperature and corrosionresistant
        oil industry multiwire cable. A dedicated communication protocol has
        been implemented at physical and data link layers which involved pulse
        position modulation PPM digital signal processing DSP and an integrity
        validation byte. The data transmission system was composed of an
        ultrasonic guided waves PPM encoded data transmitter a 1K22 MP35N
        multiwire cable a hardware preamplifier a data acquisition module a
        realtime RT DSP LabVIEW National Instruments Austin TX based demodulator
        and a humanmachine interface HMI running on a personal computer. To
        evaluate the communication system the transmitter generated 60 kHz PPM
        energy packets containing three different bytes and their corresponding
        integrity validation bytes. Experimental tests were conducted in the
        laboratory using 1 and 10 m length cables. Although a dispersive solid
        elastic media was used as a communication channel results showed that
        digital data transmission rates up to 470 bps were effectively
        validated.
      - >-
        Improved Optimization of Motion Primitives for Motion Planning in State
        Lattices.In this paper we propose a framework for generating motion
        primitives for latticebased motion planners automatically. Given a
        family of systems the user only needs to specify which principle types
        of motions which are here denoted maneuvers that are relevant for the
        considered system family. Based on the selected maneuver types and a
        selected system instance the algorithm not only automatically optimizes
        the motions connecting predefined boundary conditions but also
        simultaneously optimizes the endpoint boundary conditions as well. This
        significantly reduces the time consuming part of manually specifying all
        boundary value problems that should be solved and no exhaustive search
        to generate feasible motions is required. In addition to handling static
        a priori known system parameters the framework also allows for fast
        automatic reoptimization of motion primitives if the system parameters
        change while the system is in use e.g if the load significantly changes
        or a trailer with a new geometry is picked up by an autonomous truck. We
        also show in several numerical examples that the framework can enhance
        the performance of the motion planner in terms of total cost for the
        produced solution.
  - source_sentence: >-
      Exploring corporate governance research in accounting journals through
      latent semantic and topic analyses.The literature on corporate governance
      CG has been expanding at an unprecedented rate since major corporate
      scandals surfaced such as Enron WorldCom and HealthSouth. Corresponding
      with accountingu0027s important role in CG accounting scholars
      increasingly have investigated CG in recent years so the body of
      literature is growing. Although previous attempts have been made to
      summarize extant literature on CG via reviews none of these attempts has
      utilized recent developments in text analyses and natural language
      processing. This study uses latent semantic and topic analyses to address
      this research gap by analysing abstracts from 1399 articles in all
      accounting journals that the Australian Business Deans Council ABDC has
      rated A and A. The ABDC journal list is widely recognized as a
      journalquality indicator across many universities worldwide. The analyses
      revealed 10 distinct research topics on CG in the ABDCu0027s top
      accounting journals. The results presented include the five most
      representative articles for each topic as distinguished by topic scores.
      This study carries important practice and policy implications as it
      reveals major research streams and exhibits how researchers respond to
      various CG problems.
    sentences:
      - >-
        Performance Analysis of Small Cells Deployment under Imperfect Traffic
        Hotspot Localization.Heterogeneous Networks HetNets long been considered
        in operatorsu0027 roadmaps for macrocellsu0027 network improvements
        still continue to attract interest for 5G network deployments.
        Understanding the efficiency of small cell deployment in the presence of
        traffic hotspots can further draw operatorsu0027 attention to this
        feature. In this context we evaluate the impact of imperfect small cell
        positioning on the network performances. We show that the latter is
        mainly impacted by the position of the hotspot within the cell in case
        the hotspot is near the macrocell even a perfect positioning of the
        small cell will not yield improved performance due to the interference
        coming from the macrocell. In the case where the hotspot is located far
        enough from the macrocell even a large error in small cell positioning
        would still be beneficial in offloading traffic from the congested
        macrocell.
      - >-
        Corporate disclosure via social media a data science approach.The
        purpose of this paper is to investigate corporate financial disclosure
        via Twitter among the top listed 350 companies in the UK as well as
        identify the determinants of the extent of social media usage to
        disclose financial information.This study applies an unsupervised
        machine learning technique namely Latent Dirichlet Allocation topic
        modeling to identify financial disclosure tweets. Panel Logistic and
        Generalized Linear Model Regressions are also run to identify the
        determinants of financial disclosure on Twitter focusing mainly on board
        characteristics.Topic modeling results reveal that companies mainly
        tweet about 12 topics including financial disclosure which has a
        probability of occurrence of about 7 percent. Several board
        characteristics are found to be associated with the extent of Twitter
        usage as a financial disclosure platform among which are board
        independence gender diversity and board tenure.The extensive literature
        examines disclosure via traditional media and its determinants yet this
        paper extends the literature by investigating the relatively new
        disclosure channel of social media. This study is among the first to
        utilize machine learning instead of manual coding techniques to
        automatically unveil the tweets topics and reveal financial disclosure
        tweets. It is also among the first to investigate the relationships
        between several board characteristics and financial disclosure on
        Twitter providing a distinction between the roles of executive vs
        nonexecutive directors relating to disclosure decisions.
      - >-
        Feasibility of Replacing the Range Doppler Equation of Spaceborne
        Synthetic Aperture Radar Considering Atmospheric Propagation Delay with
        a Rational Polynomial Coefficient Model.Usually the rational polynomial
        coefficient RPC model of spaceborne synthetic aperture radar SAR is
        fitted by the original range Doppler RD model. However the radar signal
        is affected by twoway atmospheric delay which causes measurement error
        in the slant range term of the RD model. In this paper two atmospheric
        delay correction methods are proposed for use in terrainindependent RPC
        fitting singlescene SAR imaging with a unique atmospheric delay
        correction parameter plan 1 and singlescene SAR imaging with spatially
        varying atmospheric delay correction parameters plan 2. The feasibility
        of the two methods was verified by conducting fitting experiments and
        geometric positioning accuracy verification of the RPC model. The
        experiments for the GF3 satellite were performed by using global
        meteorological data a global digital elevation model and ground control
        data from several regions in China. The experimental results show that
        it is feasible to use plan 1 or plan 2 to correct the atmospheric delay
        error no matter whether in plain mountainous or plateau areas. Moreover
        the geometric positioning accuracy of the RPC model after correcting the
        atmospheric delay was improved to better than 3 m. This is of great
        significance for the efficient and highprecision geometric processing of
        spaceborne SAR images.
  - source_sentence: >-
      DCNN and LDARFRFE Based ShortTerm Electricity Load and Price
      Forecasting.In this paper Deep Convolutional Neural Network DCNN is
      proposed for short term electricity load and price forecasting. Extracting
      useful information from data and then using that information for
      prediction is a challenging task. This paper presents a model consisting
      of two stages feature engineering and prediction. Feature engineering
      comprises of Feature Extraction FE and Feature Selection FS. For FS this
      paper proposes a technique that is combination of Random Forest RF and
      Recursive Feature Elimination RFE. The proposed technique is used for
      feature redundancy removal and dimensionality reduction. After finding the
      useful features DCNN is used for electricity price and load forecasting.
      DCNN performance is compared with Convolutional Neural Network CNN and
      Support Vector Classifier SVC models. Using the forecasting models
      dayahead and the week ahead forecasting is done for electricity price and
      load. To evaluate the CNN SVC and DCNN models real electricity market data
      is used. Mean Absolute Error MAE and Root Mean Square Error RMSE are used
      to evaluate the performance of the models. DCNN outperforms compared
      models by yielding lesser errors.
    sentences:
      - >-
        Stable twosided satisfied matching for ridesharing system based on
        preference orders.Ridesharing has emerged as an alternative
        transportation mode along road networks around the world. Rideshare
        matching problem is vital to improve the sustainable development of
        ridesharing systems. This paper aims to address the stable twosided
        satisfied matching problem considering the participants psychological
        perception. First of all we investigate the elements that influence
        passengers and drivers ridesharing experience by means of semistructured
        research interviews and questionnaire survey. Two ridesharing perception
        evaluation systems are originally established to get the preference
        orders of passengers and drivers separately. Then rideshare
        matchingrelated definitions are stated and rideshare matching linguistic
        information processing is also elaborated in detail based on preference
        utility function disappointment function as well as elation function.
        Furthermore we propose a stable twosided satisfied matching model on
        account of fuzzy linguistic information processing about ridesharing
        which is able to reflect participants psychological factors. To verify
        the validity of our model we present a twosided matching case based on
        hypothetical rideshare matching platform. The analytical results
        indicate that the use of stable twosided satisfied matching method based
        on fuzzy linguistic information enables to substantially satisfy both
        drivers and passengers expectation and improve the sustainability of
        ridesharing systems.
      - >-
        ShortTerm Electricity Load and Price Forecasting using Enhanced KNN.In
        this paper we introduced a new enhanced technique to resolve the issue
        of electricity price and load forecasting. In Smart Grids SGs Price and
        load forecasting is the major issue. Framework of enhanced technique
        comprises of classification and feature engineering. Feature engineering
        comprises of feature selection and feature extraction. Decision Tree
        Regression DTR is used for feature selection. Recursive Feature
        Elimination RFE is used for feature selection which eliminates the
        redundancy of features. The second step of feature engineering feature
        extraction is done using Singular Value Decomposition SVD which reduces
        the dimensionality of features. Last step is to predict the load and
        forecast. For forecasting electricity load and price two existing
        techniques KNearest Neighbors KNN and MultiLayer Perceptron MLP and a
        newly proposed technique known as Enhanced KNN EKNN is being used. The
        proposed technique outperforms than MLP and KNN in terms of accuracy.
        KNN is working on nonparametric method which is used for classification
        and regression.
      - >-
        Death Ground.Death Ground is a competitive musical installationgame for
        two players. The work is designed to provide the framework for the
        playersparticipants in which to perform gamemediated musical gestures
        against eachother. The main mechanic involves destroying the other
        playeru0027s avatar by outmaneuvering and using audio weapons and
        improvised musical actions against it. These weapons are spawned in an
        enclosed area during the performance and can be used by whoever is
        collects them first. There is a multitude of such powerups all of which
        have different properties such as speed boost additional damage ground
        traps and so on. All of these weapons affect the sound and sonic
        textures that each of the avatars produce. Additionally the players can
        use elements of the environment such as platforms obstructions and
        elevation in order to gain competitive advantage or position themselves
        strategically to access first the spawned powerups.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dblp aminer 50k dev
          type: dblp-aminer-50k-dev
        metrics:
          - type: cosine_accuracy
            value: 1
            name: Cosine Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: dblp aminer 50k test
          type: dblp-aminer-50k-test
        metrics:
          - type: cosine_accuracy
            value: 1
            name: Cosine Accuracy

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2 on the parquet dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • parquet

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
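The Pooling and Normalize stages listed above can be illustrated without loading the model. The following is a minimal sketch in plain Python, assuming toy 2-dimensional token vectors in place of the 768-dimensional MPNet outputs; the helper names `mean_pool`, `l2_normalize`, and `dot` are hypothetical and are not part of the sentence-transformers API.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    return [value / max(count, 1) for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy token embeddings (assumption: 2 dimensions instead of 768).
# The third token is padding and must not influence the result.
tokens = [[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # mean of the two unmasked tokens
embedding = l2_normalize(pooled)      # unit-length sentence embedding
```

Because the final stage normalizes every embedding to unit length, the cosine similarity between two embeddings equals their dot product.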

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'DCNN and LDARFRFE Based ShortTerm Electricity Load and Price Forecasting.In this paper Deep Convolutional Neural Network DCNN is proposed for short term electricity load and price forecasting. Extracting useful information from data and then using that information for prediction is a challenging task. This paper presents a model consisting of two stages feature engineering and prediction. Feature engineering comprises of Feature Extraction FE and Feature Selection FS. For FS this paper proposes a technique that is combination of Random Forest RF and Recursive Feature Elimination RFE. The proposed technique is used for feature redundancy removal and dimensionality reduction. After finding the useful features DCNN is used for electricity price and load forecasting. DCNN performance is compared with Convolutional Neural Network CNN and Support Vector Classifier SVC models. Using the forecasting models dayahead and the week ahead forecasting is done for electricity price and load. To evaluate the CNN SVC and DCNN models real electricity market data is used. Mean Absolute Error MAE and Root Mean Square Error RMSE are used to evaluate the performance of the models. DCNN outperforms compared models by yielding lesser errors.',
    'ShortTerm Electricity Load and Price Forecasting using Enhanced KNN.In this paper we introduced a new enhanced technique to resolve the issue of electricity price and load forecasting. In Smart Grids SGs Price and load forecasting is the major issue. Framework of enhanced technique comprises of classification and feature engineering. Feature engineering comprises of feature selection and feature extraction. Decision Tree Regression DTR is used for feature selection. Recursive Feature Elimination RFE is used for feature selection which eliminates the redundancy of features. The second step of feature engineering feature extraction is done using Singular Value Decomposition SVD which reduces the dimensionality of features. Last step is to predict the load and forecast. For forecasting electricity load and price two existing techniques KNearest Neighbors KNN and MultiLayer Perceptron MLP and a newly proposed technique known as Enhanced KNN EKNN is being used. The proposed technique outperforms than MLP and KNN in terms of accuracy. KNN is working on nonparametric method which is used for classification and regression.',
    'Death Ground.Death Ground is a competitive musical installationgame for two players. The work is designed to provide the framework for the playersparticipants in which to perform gamemediated musical gestures against eachother. The main mechanic involves destroying the other playeru0027s avatar by outmaneuvering and using audio weapons and improvised musical actions against it. These weapons are spawned in an enclosed area during the performance and can be used by whoever is collects them first. There is a multitude of such powerups all of which have different properties such as speed boost additional damage ground traps and so on. All of these weapons affect the sound and sonic textures that each of the avatars produce. Additionally the players can use elements of the environment such as platforms obstructions and elevation in order to gain competitive advantage or position themselves strategically to access first the spawned powerups.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7770, 0.0657],
#         [0.7770, 1.0000, 0.0281],
#         [0.0657, 0.0281, 1.0000]])

Evaluation

Metrics

Triplet

  • Datasets: dblp-aminer-50k-dev and dblp-aminer-50k-test
  • Evaluated with TripletEvaluator
| Metric | dblp-aminer-50k-dev | dblp-aminer-50k-test |
|---|---|---|
| cosine_accuracy | 1.0 | 1.0 |
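
The cosine_accuracy figure can be read as the share of triplets in which the anchor embedding is more similar to the positive than to the negative. The following is a minimal sketch of that calculation, assuming plain cosine similarity and no margin. The helper names and the toy 2-D vectors are illustrative only; they are not model outputs and not part of the Sentence Transformers API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets in which the
    anchor is more similar to the positive than to the negative."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine(anchor, positive) > cosine(anchor, negative)
    )
    return correct / len(triplets)


# Toy embeddings, made up for illustration.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: counted as correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: counted as correct
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: counted as wrong
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

A value of 1.0, as reported for both splits, would mean that every evaluated triplet satisfied this ordering.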

Training Details

Training Dataset

parquet

  • Dataset: parquet
  • Size: 46,900 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive | negative |
    |---|---|---|---|
    | type    | string | string | string |
    | details | min: 132 tokens, mean: 232.79 tokens, max: 384 tokens | min: 123 tokens, mean: 247.67 tokens, max: 384 tokens | min: 69 tokens, mean: 218.48 tokens, max: 384 tokens |
  • Samples:
    | anchor | positive | negative |
    |---|---|---|
    | The longterm effect of media violence exposure on aggression of youngsters.Abstract The effect of media violence on aggression has always been a trending issue and a better understanding of the psychological mechanism of the impact of media violence on youth aggression is an extremely important research topic for preventing the negative impacts of media violence and juvenile delinquency. From the perspective of anger this study explored the longterm effect of different degrees of media violence exposure on the aggression of youngsters as well as the role of aggressive emotions. The studies found that individuals with a high degree of media violence exposure HMVE exhibited higher levels of proactive aggression in both irritation situations and higher levels of reactive aggression in lowirritation situations than did participants with a low degree of media violence exposure LMVE. After being provoked the anger of all participants was significantly increased and the anger and proactive ag... | Cyberbullying perpetration and victimization among children and adolescents A systematic review of longitudinal studies.Abstract In this systematic review of exclusively longitudinal studies on cyberbullying perpetration and victimization among adolescents we identified 76 original longitudinal studies published between 2007 and 2017. The majority of them approached middle school students in two waves at 6 or 12 months apart. Prevalence rates for cyberbullying perpetration varied between 5.3 and 66.2 percent and for cyberbullying victimization between 1.9 and 84.0 percent. Personrelated factors e.g. traditional bullying internalizing problems were among the most studied concepts primarily examined as significant risk factors. Evidence on the causal relationships with mediarelated factors e.g. problematic Internet use and environmental factors e.g. parent and peer relations was scarce. This review identified gaps for future longitudinal research on cyberbullying perpetration and victimi... | Any small multiplicative subgroup is not a sumset.Abstract We prove that for an arbitrary e u003e 0 and any multiplicative subgroup Γ F p 1 Γ p 2 3 e there are no sets B C F p with B C u003e 1 such that Γ B C . Also we obtain that for 1 Γ p 6 7 e and any ξ 0 there is no a set B such that ξ Γ 1 B B . |
    | The longterm effect of media violence exposure on aggression of youngsters.Abstract The effect of media violence on aggression has always been a trending issue and a better understanding of the psychological mechanism of the impact of media violence on youth aggression is an extremely important research topic for preventing the negative impacts of media violence and juvenile delinquency. From the perspective of anger this study explored the longterm effect of different degrees of media violence exposure on the aggression of youngsters as well as the role of aggressive emotions. The studies found that individuals with a high degree of media violence exposure HMVE exhibited higher levels of proactive aggression in both irritation situations and higher levels of reactive aggression in lowirritation situations than did participants with a low degree of media violence exposure LMVE. After being provoked the anger of all participants was significantly increased and the anger and proactive ag... | Cyberbullying perpetration and victimization among children and adolescents A systematic review of longitudinal studies.Abstract In this systematic review of exclusively longitudinal studies on cyberbullying perpetration and victimization among adolescents we identified 76 original longitudinal studies published between 2007 and 2017. The majority of them approached middle school students in two waves at 6 or 12 months apart. Prevalence rates for cyberbullying perpetration varied between 5.3 and 66.2 percent and for cyberbullying victimization between 1.9 and 84.0 percent. Personrelated factors e.g. traditional bullying internalizing problems were among the most studied concepts primarily examined as significant risk factors. Evidence on the causal relationships with mediarelated factors e.g. problematic Internet use and environmental factors e.g. parent and peer relations was scarce. This review identified gaps for future longitudinal research on cyberbullying perpetration and victimi... | Unmanned agricultural product sales system.The invention relates to the field of agricultural product sales provides an unmanned agricultural product sales system and aims to solve the problem of agricultural product waste caused by the factthat most farmers can only prepare goods according to guessing and experiences when selling agricultural products at present. The unmanned agricultural product sales system comprises an acquisition module for acquiring selection information of customers a storage module which prestores a vegetable preparation scheme a matching module which is used for matching a corresponding side dish schemefrom the storage module according to the selection information of the client a pushing module which is used for pushing the matched side dish scheme back to the client an acquisition module which isalso used for acquiring confirmation information of a client an order module which is used for generating order information according to the confirmation information ... |
    | The longterm effect of media violence exposure on aggression of youngsters.Abstract The effect of media violence on aggression has always been a trending issue and a better understanding of the psychological mechanism of the impact of media violence on youth aggression is an extremely important research topic for preventing the negative impacts of media violence and juvenile delinquency. From the perspective of anger this study explored the longterm effect of different degrees of media violence exposure on the aggression of youngsters as well as the role of aggressive emotions. The studies found that individuals with a high degree of media violence exposure HMVE exhibited higher levels of proactive aggression in both irritation situations and higher levels of reactive aggression in lowirritation situations than did participants with a low degree of media violence exposure LMVE. After being provoked the anger of all participants was significantly increased and the anger and proactive ag... | Cyberbullying perpetration and victimization among children and adolescents A systematic review of longitudinal studies.Abstract In this systematic review of exclusively longitudinal studies on cyberbullying perpetration and victimization among adolescents we identified 76 original longitudinal studies published between 2007 and 2017. The majority of them approached middle school students in two waves at 6 or 12 months apart. Prevalence rates for cyberbullying perpetration varied between 5.3 and 66.2 percent and for cyberbullying victimization between 1.9 and 84.0 percent. Personrelated factors e.g. traditional bullying internalizing problems were among the most studied concepts primarily examined as significant risk factors. Evidence on the causal relationships with mediarelated factors e.g. problematic Internet use and environmental factors e.g. parent and peer relations was scarce. This review identified gaps for future longitudinal research on cyberbullying perpetration and victimi... | Minimum number of additive tuples in groups of prime order.For a prime number p and a sequence of integers a0 . . . ak 01 . . . p lets a0 . . . ak be the minimum number of k 1tuples x0 . . . xk A0Akwithx0x1xk over subsets a0 . . . AkZp of sizes a0 . . . ak respectively. We observe that an elegant argument of Samotij and Sudakov can be extended to show that there exists an extremal configuration with all sets Ai being intervals of appropriate length. The same conclusion also holds for the related problem posed by Bajnok whena0akaandA0Ak provided k is not equal 1 modulop. Finally by applying basic Fourier analysis we show for Bajnoks problem that if pu003e13 and a 3 . . . p3are fixed whilek1 modp tends to infinity then the extremal configuration alternates between at least two affine nonequivalent sets. |
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 16,
        "gather_across_devices": false
    }
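
The parameters above describe an in-batch ranking objective: each anchor is scored against every candidate in the batch with cosine similarity, the scores are multiplied by `scale` (20.0), and cross-entropy pushes the matching positive to the top. The following is a minimal pure-Python sketch of that objective, assuming the standard multiple-negatives formulation in which the candidates are all positives plus all explicit negatives in the batch. The function names and toy vectors are illustrative. As described in the cited gradient-caching paper, `mini_batch_size` is understood here to affect only how embeddings are chunked to limit memory, so it does not appear in the loss value.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy over scaled cosine similarities.

    For anchor i the target is positives[i]; every other positive and
    every explicit negative in the batch acts as a negative candidate.
    """
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, candidate) for candidate in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two triplets (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned_loss = mnr_loss(anchors, positives, negatives)
# Swapping the positives makes each anchor's target its worst match.
swapped_loss = mnr_loss(anchors, positives[::-1], negatives)
print(aligned_loss, swapped_loss)
```

With well-separated embeddings the aligned loss is close to zero, while mismatched pairs are penalised roughly in proportion to `scale`.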
    

Evaluation Dataset

parquet

  • Dataset: parquet
  • Size: 5,862 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive | negative |
    |---|---|---|---|
    | type    | string | string | string |
    | details | min: 132 tokens, mean: 225.49 tokens, max: 384 tokens | min: 124 tokens, mean: 240.03 tokens, max: 384 tokens | min: 69 tokens, mean: 221.83 tokens, max: 384 tokens |
  • Samples:
    | anchor | positive | negative |
    |---|---|---|
    | Nonlocal Recoloring Algorithm for Color Vision Deficiencies with Naturalness and Detail Preserving.People with Color Vision Deficiencies CVD may have difficulty in recognizing and communicating color information especially in the multimedia era. In this paper we proposed a recoloring algorithm to enhance visual perception of people with CVD. In the algorithm color modification for color blindness is conducted in HSV color space under three constraints detail naturalness and authenticity. A new nonlocal recoloring method is used for preserving details. Subjective experiments were conducted among normal vision subjects and color blind subjects. Experimental results show that our algorithm is robust detail preserving and maintains naturalness. Source codes are freely available to noncommercial users at the website httpsdoi.org10.6084m9.figshare.9742337.v2. | Improving Color Discrimination for Color Vision Deficiency CVD with TemporalDomain Modulation.Color Vision Deficiency CVD is often characterized by the inability to distinguish color due to a defective or missing cone in the eye. Although it is possible to modify the observed color to make it easier for users to distinguish this can lead to color confusion with unaffected colors. To address this problem we investigate how flicker can assist distinguishing colors for CVD patients. In preliminary study we evaluated the efficiency of color and brightness modulation with 4 participants with normal vision. Our findings suggests that while brightness modulation was ineffective color modulation can help users distinguish between different colors. | Pooled Mining is Driving Blockchains Toward Centralized Systems.The decentralization property of blockchains stems from the fact that each miner accepts or refuses transactions and blocks based on its own verification results. However pooled mining causes blockchains to evolve into centralized systems because pool participants delegate their decisionmaking rights to pool managers. In this paper we established and validated a model for ProofofWork mining introduced the concept of equivalent blocks and quantitatively derived that pooling effectively lowers the income variance of miners. We also analyzed Bitcoin and Ethereum data to prove that pooled mining has become prevalent in the real world. The percentage of poolmined blocks increased from 49.91 to 91.12 within four months in Bitcoin and from 76.9 to 92.2 within five months in Ethereum. In July 2018 Bitcoin and Ethereum mining were dominated by only six and five pools respectively. |
    | Nonlocal Recoloring Algorithm for Color Vision Deficiencies with Naturalness and Detail Preserving.People with Color Vision Deficiencies CVD may have difficulty in recognizing and communicating color information especially in the multimedia era. In this paper we proposed a recoloring algorithm to enhance visual perception of people with CVD. In the algorithm color modification for color blindness is conducted in HSV color space under three constraints detail naturalness and authenticity. A new nonlocal recoloring method is used for preserving details. Subjective experiments were conducted among normal vision subjects and color blind subjects. Experimental results show that our algorithm is robust detail preserving and maintains naturalness. Source codes are freely available to noncommercial users at the website httpsdoi.org10.6084m9.figshare.9742337.v2. | Improving Color Discrimination for Color Vision Deficiency CVD with TemporalDomain Modulation.Color Vision Deficiency CVD is often characterized by the inability to distinguish color due to a defective or missing cone in the eye. Although it is possible to modify the observed color to make it easier for users to distinguish this can lead to color confusion with unaffected colors. To address this problem we investigate how flicker can assist distinguishing colors for CVD patients. In preliminary study we evaluated the efficiency of color and brightness modulation with 4 participants with normal vision. Our findings suggests that while brightness modulation was ineffective color modulation can help users distinguish between different colors. | Effects of Brownfield Remediation on Total Gaseous Mercury Concentrations in an Urban Landscape.In order to obtain a better perspective of the impacts of brownfields on the landatmosphere exchange of mercury in urban areas total gaseous mercury TGM was measured at two heights 1.8 m and 42.7 m prior to 20112012 and after 20152016 for the remediation of a brownfield and installation of a parking lot adjacent to the Syracuse Center of Excellence in Syracuse NY USA. Prior to brownfield remediation the annual average TGM concentrations were 1.6 0.6 and 1.4 0.4 ng m 3 at the ground and upper heights respectively. After brownfield remediation the annual average TGM concentrations decreased by 32 and 22 at the ground and the upper height respectively. Mercury soil flux measurements during summer after remediation showed net TGM deposition of 1.7 ng m 2 day 1 suggesting that the site transitioned from a mercury source to a net mercury sink. Measurements from the Atmospheric Mercury Netw... |
    | Named Entity Recognition for Nepali Language.Named Entity Recognition NER has been studied for many languages like English German Spanish and others but virtually no studies have focused on the Nepali language. One key reason is the lack of an appropriate annotated dataset. In this paper we describe a Nepali NER dataset that we created. We discuss and compare the performance of various machine learning models on this dataset. We also propose a novel NER scheme for Nepali and show that this scheme based on graphemelevel representations outperforms characterlevel representations when combined with BiLSTM models. Our best models obtain an overall F1 score of 86.89 which is a significant improvement on previously reported performance in literature. | Enhancing the Performance of Telugu Named Entity Recognition Using Gazetteer Features.Named entity recognition NER is a fundamental step for many natural language processing tasks and hence enhancing the performance of NER models is always appreciated. With limited resources being available NER for SouthEast Asian languages like Telugu is quite a challenging problem. This paper attempts to improve the NER performance for Telugu using gazetteerrelated features which are automatically generated using Wikipedia pages. We make use of these gazetteer features along with other wellknown features like contextual wordlevel and corpus features to build NER models. NER models are developed using three wellknown classifiersconditional random field CRF support vector machine SVM and margin infused relaxed algorithms MIRA. The gazetteer features are shown to improve the performance and theMIRAbased NER model fared better than its counterparts SVM and CRF. | Using Inversionmode MOS Varactors and 3port Inductor in 018µm CMOS Voltage Controlled Oscillator.This paper presents a RF voltage controlled oscillator VCO using inversionmode MOS varactors and 3port inductors to achieve low power consumption low phase noise broad tuning range and minimized chip size. The proposed circuit architecture using bodybiased technique operates from 4.3 to 5 GHz with 20.8 tuning range. The measured phase noise is less than 125.34 dBc at a displacement frequency of 1 MHz. The power consumption of this VCO is 25 mW when biased at 1.8 V. This VCO was implemented in standard TSMC 0.18µm 1P6M process. The chip size is 0.476 mm2 including the pads which is only 63 comparing with an identical VCO using TSMC inductor model. |
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 16,
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch | Step | Training Loss | Validation Loss | dblp-aminer-50k-dev_cosine_accuracy | dblp-aminer-50k-test_cosine_accuracy |
|---|---|---|---|---|---|
| -1 | -1 | - | - | 1.0 | - |
| 0.2725 | 100 | 0.223 | 0.0166 | 1.0 | - |
| 0.5450 | 200 | 0.0699 | 0.0208 | 1.0 | - |
| 0.8174 | 300 | 0.0267 | 0.0196 | 1.0 | - |
| -1 | -1 | - | - | - | 1.0 |
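
The Epoch values logged at steps 100, 200 and 300 can be related to the dataset size and batch size reported earlier. The following is a small worked calculation, assuming a single training device (the card does not state the device count) and that the last partial batch is kept (`dataloader_drop_last: False`). The variable and function names are illustrative, and the `no_duplicates` batch sampler could in principle change the batch count slightly.

```python
import math

train_samples = 46_900  # "Size" of the training dataset
batch_size = 128        # per_device_train_batch_size
warmup_ratio = 0.1      # warmup_ratio
num_devices = 1         # assumption: not stated in the card

# One optimizer step per batch (gradient_accumulation_steps: 1).
steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices))

# Warmup length implied by the ratio, rounded up to whole steps.
warmup_steps = math.ceil(warmup_ratio * steps_per_epoch)


def epoch_at(step):
    """Fraction of the single training epoch completed at a given step."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)
print(warmup_steps)
print([epoch_at(step) for step in (100, 200, 300)])
```

Under these assumptions the computed epoch fractions agree with the logged values 0.2725, 0.5450 and 0.8174, which suggests that the run covered about 367 steps, with the learning rate warming up over roughly the first 37.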

Framework Versions

  • Python: 3.11.4
  • Sentence Transformers: 5.1.1
  • Transformers: 4.56.2
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}